Confine vacuum skip logic to lazy_scan_skip

Started by Melanie Plageman, about 2 years ago, 81 messages
#1 Melanie Plageman
melanieplageman@gmail.com
7 attachment(s)

Hi,

I've written a patch set for vacuum to use the streaming read interface
proposed in [1]. Making lazy_scan_heap() async-friendly required a bit
of refactoring of lazy_scan_heap() and lazy_scan_skip(): all of the
skipping logic -- previously spread across lazy_scan_heap() and
lazy_scan_skip() -- is now confined to lazy_scan_skip(). The attached
patches, which do this and other preparation for vacuum to use the
streaming read API, apply on top of master.

A few comments still need to be updated, and I noticed I need to
reorder and combine a couple of the commits. I wanted to register this
for the January commitfest, so I didn't quite have time for the
finishing touches.

- Melanie

[1]: /messages/by-id/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com

Attachments:

v1-0001-lazy_scan_skip-remove-unnecessary-local-var-rel_p.patch (text/x-patch)
From 9cd579d6a20aef2aeeab6ef50d72e779d75bf7cd Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:18:40 -0500
Subject: [PATCH v1 1/7] lazy_scan_skip remove unnecessary local var rel_pages

lazy_scan_skip() only uses vacrel->rel_pages twice, so there seems to be
no reason to save it in a local variable, rel_pages.
---
 src/backend/access/heap/vacuumlazy.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b9299b8924..c4e0c077694 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,13 +1302,12 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
+	BlockNumber next_unskippable_block = next_block,
 				nskippable_blocks = 0;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	while (next_unskippable_block < vacrel->rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
@@ -1331,7 +1330,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		 *
 		 * Implement this by always treating the last block as unsafe to skip.
 		 */
-		if (next_unskippable_block == rel_pages - 1)
+		if (next_unskippable_block == vacrel->rel_pages - 1)
 			break;
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-- 
2.37.2

v1-0002-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch (text/x-patch)
From 314dd9038593610583e4fe60ab62e0d46ea3be86 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v1 2/7] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed-in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4e0c077694..3b28ea2cdb5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,8 +1302,7 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+	BlockNumber next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1360,7 +1359,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1373,7 +1371,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

v1-0003-Add-lazy_scan_skip-unskippable-state.patch (text/x-patch)
From 67043818003faa9cf3cdf10e6fdc6cbf6f8eee4c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v1 3/7] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits clearer, first
introduce a struct, VacSkipState, which will maintain the variables
needed to skip ranges of at least SKIP_PAGES_THRESHOLD pages.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 105 ++++++++++++++++-----------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 64 insertions(+), 42 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b28ea2cdb5..6f9c2446c56 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -238,13 +238,24 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * Parameters maintained by lazy_scan_skip() to manage skipping ranges of pages
+ * greater than SKIP_PAGES_THRESHOLD.
+ */
+typedef struct VacSkipState
+{
+	/* Next unskippable block */
+	BlockNumber next_unskippable_block;
+	/* Next unskippable block's visibility status */
+	bool		next_unskippable_allvis;
+	/* Whether or not skippable blocks should be skipped */
+	bool		skipping_current_range;
+} VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+						   BlockNumber next_block, Buffer *vmbuffer);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -826,12 +837,10 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
+	VacSkipState vacskip;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -846,9 +855,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, &vacskip, 0, &vmbuffer);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -856,26 +863,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacskip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacskip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, &vacskip, blkno + 1, &vmbuffer);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacskip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacskip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -1280,15 +1284,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in vacskip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in vacskip->next_unskippable_block and next_unskippable_allvis.
+ * vacskip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1298,24 +1321,24 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+			   BlockNumber next_block, Buffer *vmbuffer)
 {
-	BlockNumber next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < vacrel->rel_pages)
+	vacskip->next_unskippable_block = next_block;
+	vacskip->next_unskippable_allvis = true;
+	while (vacskip->next_unskippable_block < vacrel->rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
+													   vacskip->next_unskippable_block,
 													   vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacskip->next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1329,14 +1352,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		 *
 		 * Implement this by always treating the last block as unsafe to skip.
 		 */
-		if (next_unskippable_block == vacrel->rel_pages - 1)
+		if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
 			break;
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacskip->next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1358,7 +1381,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		}
 
 		vacuum_delay_point();
-		next_unskippable_block++;
+		vacskip->next_unskippable_block++;
 	}
 
 	/*
@@ -1371,16 +1394,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacskip->next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacskip->skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacskip->skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e37ef9aa76d..bd008e1699b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2955,6 +2955,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacSkipState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.37.2

v1-0004-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch (text/x-patch)
From 387db30b7e06f450dc7f60494751f78e00d43272 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v1 4/7] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and eventually
AIO), refactor vacuum's logic for skipping blocks such that it is entirely
confined to lazy_scan_skip(). This turns lazy_scan_skip() and the VacSkipState
it uses into an iterator which yields blocks to lazy_scan_heap(). Such a
structure is conducive to an async interface.

By always calling lazy_scan_skip() -- instead of only when we have reached the
next unskippable block -- we no longer need the skipping_current_range variable.
lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
can derive the visibility status of a block from whether or not we are in a
skippable range -- that is, whether or not the next_block is equal to the next
unskippable block.

ci-os-only:
---
 src/backend/access/heap/vacuumlazy.c | 230 ++++++++++++++-------------
 1 file changed, 117 insertions(+), 113 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 6f9c2446c56..5070c3fe744 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,14 +248,13 @@ typedef struct VacSkipState
 	BlockNumber next_unskippable_block;
 	/* Next unskippable block's visibility status */
 	bool		next_unskippable_allvis;
-	/* Whether or not skippable blocks should be skipped */
-	bool		skipping_current_range;
 } VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
-						   BlockNumber next_block, Buffer *vmbuffer);
+static BlockNumber lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+								  BlockNumber blkno, Buffer *vmbuffer,
+								  bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -836,9 +835,12 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	VacSkipState vacskip;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
+	VacSkipState vacskip = {.next_unskippable_block = InvalidBlockNumber};
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -854,37 +856,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, &vacskip, 0, &vmbuffer);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
-		if (blkno == vacskip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacskip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, &vacskip, blkno + 1, &vmbuffer);
-
-			Assert(vacskip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacskip.skipping_current_range)
-				continue;
+		blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+									&vmbuffer, &all_visible_according_to_vm);
 
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
+		if (blkno == InvalidBlockNumber)
+			break;
 
 		vacrel->scanned_pages++;
 
@@ -1281,20 +1263,13 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in vacskip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	lazy_scan_skip() -- get next block for vacuum to process
  *
- * vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needing a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * lazy_scan_skip() returns the next block that needs to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1304,14 +1279,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in vacskip->next_unskippable_block and next_unskippable_allvis.
- * vacskip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacskip->next_unskippable_block and next_unskippable_allvis. The caller
+ * should not concern itself with anything in vacskip. This is only used by
+ * lazy_scan_skip() to keep track of this state across invocations.
+ *
+ * lazy_scan_skip() returns the next block for vacuum to process and sets its
+ * visibility status in the output parameter, all_visible_according_to_vm.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1321,87 +1308,104 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
+static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
-			   BlockNumber next_block, Buffer *vmbuffer)
+			   BlockNumber next_block, Buffer *vmbuffer,
+			   bool *all_visible_according_to_vm)
 {
 	bool		skipsallvis = false;
 
-	vacskip->next_unskippable_block = next_block;
-	vacskip->next_unskippable_allvis = true;
-	while (vacskip->next_unskippable_block < vacrel->rel_pages)
-	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   vacskip->next_unskippable_block,
-													   vmbuffer);
+	if (next_block >= vacrel->rel_pages)
+		return InvalidBlockNumber;
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacskip->next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacskip->next_unskippable_block)
+	{
+		while (++vacskip->next_unskippable_block < vacrel->rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacskip->next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   vacskip->next_unskippable_block,
+														   vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
-			break;
+			vacskip->next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacskip->next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacskip->next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacskip->next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		vacskip->next_unskippable_block++;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacskip->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacskip->next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
 	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacskip->next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacskip->skipping_current_range = false;
+	if (next_block == vacskip->next_unskippable_block)
+		*all_visible_according_to_vm = vacskip->next_unskippable_allvis;
 	else
-	{
-		vacskip->skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	return next_block;
 }
 
 /*
-- 
2.37.2

v1-0005-VacSkipState-saves-reference-to-LVRelState.patch (text/x-patch)
From 2fdb5fc93e8db3f885fc270a6742b8bd4c399aab Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 09:47:18 -0500
Subject: [PATCH v1 5/7] VacSkipState saves reference to LVRelState

The streaming read interface can only give pgsr_next callbacks access to
two pieces of private data. As such, move a reference to the LVRelState
into the VacSkipState.
---
 src/backend/access/heap/vacuumlazy.c | 36 ++++++++++++++++------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5070c3fe744..67020c2a807 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,11 +248,13 @@ typedef struct VacSkipState
 	BlockNumber next_unskippable_block;
 	/* Next unskippable block's visibility status */
 	bool		next_unskippable_allvis;
+	/* reference to whole relation vac state */
+	LVRelState *vacrel;
 } VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+static BlockNumber lazy_scan_skip(VacSkipState *vacskip,
 								  BlockNumber blkno, Buffer *vmbuffer,
 								  bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
@@ -840,7 +842,10 @@ lazy_scan_heap(LVRelState *vacrel)
 
 	/* relies on InvalidBlockNumber overflowing to 0 */
 	BlockNumber blkno = InvalidBlockNumber;
-	VacSkipState vacskip = {.next_unskippable_block = InvalidBlockNumber};
+	VacSkipState vacskip = {
+		.next_unskippable_block = InvalidBlockNumber,
+		.vacrel = vacrel
+	};
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -862,8 +867,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Page		page;
 		LVPagePruneState prunestate;
 
-		blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
-									&vmbuffer, &all_visible_according_to_vm);
+		blkno = lazy_scan_skip(&vacskip, blkno + 1,
+							   &vmbuffer, &all_visible_according_to_vm);
 
 		if (blkno == InvalidBlockNumber)
 			break;
@@ -1290,9 +1295,10 @@ lazy_scan_heap(LVRelState *vacrel)
  * lazy_scan_skip() returns the next block for vacuum to process and sets its
  * visibility status in the output parameter, all_visible_according_to_vm.
  *
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ * vacskip->vacrel is an in/out parameter here; vacuum options and information
+ * about the relation are read and vacrel->skippedallvis is set to ensure we
+ * don't advance relfrozenxid when we have skipped vacuuming all visible
+ * blocks.
  *
  * vmbuffer will contain the block from the VM containing visibility
 * information for the next unskippable heap block. We may end up needing a
@@ -1309,21 +1315,21 @@ lazy_scan_heap(LVRelState *vacrel)
  * choice to skip such a range is actually made, making everything safe.)
  */
 static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+lazy_scan_skip(VacSkipState *vacskip,
 			   BlockNumber next_block, Buffer *vmbuffer,
 			   bool *all_visible_according_to_vm)
 {
 	bool		skipsallvis = false;
 
-	if (next_block >= vacrel->rel_pages)
+	if (next_block >= vacskip->vacrel->rel_pages)
 		return InvalidBlockNumber;
 
 	if (vacskip->next_unskippable_block == InvalidBlockNumber ||
 		next_block > vacskip->next_unskippable_block)
 	{
-		while (++vacskip->next_unskippable_block < vacrel->rel_pages)
+		while (++vacskip->next_unskippable_block < vacskip->vacrel->rel_pages)
 		{
-			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+			uint8		mapbits = visibilitymap_get_status(vacskip->vacrel->rel,
 														   vacskip->next_unskippable_block,
 														   vmbuffer);
 
@@ -1347,11 +1353,11 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 			 * Implement this by always treating the last block as unsafe to
 			 * skip.
 			 */
-			if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
+			if (vacskip->next_unskippable_block == vacskip->vacrel->rel_pages - 1)
 				break;
 
 			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-			if (!vacrel->skipwithvm)
+			if (!vacskip->vacrel->skipwithvm)
 			{
 				/* Caller shouldn't rely on all_visible_according_to_vm */
 				vacskip->next_unskippable_allvis = false;
@@ -1366,7 +1372,7 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 			 */
 			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
 			{
-				if (vacrel->aggressive)
+				if (vacskip->vacrel->aggressive)
 					break;
 
 				/*
@@ -1396,7 +1402,7 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 		{
 			next_block = vacskip->next_unskippable_block;
 			if (skipsallvis)
-				vacrel->skippedallvis = true;
+				vacskip->vacrel->skippedallvis = true;
 		}
 	}
 
-- 
2.37.2

v1-0006-VacSkipState-store-next-unskippable-block-vmbuffe.patch
From a9d0592b7a54686968c4872323f5e45b809379c2 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 11:20:07 -0500
Subject: [PATCH v1 6/7] VacSkipState store next unskippable block vmbuffer

This fits nicely with the other state in VacSkipState.
---
 src/backend/access/heap/vacuumlazy.c | 47 ++++++++++++++++------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 67020c2a807..49d16fcf039 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,6 +248,8 @@ typedef struct VacSkipState
 	BlockNumber next_unskippable_block;
 	/* Next unskippable block's visibility status */
 	bool		next_unskippable_allvis;
+	/* Next unskippable block's vmbuffer */
+	Buffer		vmbuffer;
 	/* reference to whole relation vac state */
 	LVRelState *vacrel;
 } VacSkipState;
@@ -255,7 +257,7 @@ typedef struct VacSkipState
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static BlockNumber lazy_scan_skip(VacSkipState *vacskip,
-								  BlockNumber blkno, Buffer *vmbuffer,
+								  BlockNumber blkno,
 								  bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -844,10 +846,10 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber blkno = InvalidBlockNumber;
 	VacSkipState vacskip = {
 		.next_unskippable_block = InvalidBlockNumber,
+		.vmbuffer = InvalidBuffer,
 		.vacrel = vacrel
 	};
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -868,7 +870,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		LVPagePruneState prunestate;
 
 		blkno = lazy_scan_skip(&vacskip, blkno + 1,
-							   &vmbuffer, &all_visible_according_to_vm);
+							   &all_visible_according_to_vm);
 
 		if (blkno == InvalidBlockNumber)
 			break;
@@ -909,10 +911,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacskip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacskip.vmbuffer);
+				vacskip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -937,7 +939,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacskip.vmbuffer);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -957,7 +959,7 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/* Check for new or empty pages before lazy_scan_noprune call */
 			if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, true,
-									   vmbuffer))
+									   vacskip.vmbuffer))
 			{
 				/* Processed as new/empty page (lock and pin released) */
 				continue;
@@ -995,7 +997,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		}
 
 		/* Check for new or empty pages before lazy_scan_prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false, vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false,
+								   vacskip.vmbuffer))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -1032,7 +1035,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			{
 				Size		freespace;
 
-				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vmbuffer);
+				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vacskip.vmbuffer);
 
 				/* Forget the LP_DEAD items that we just vacuumed */
 				dead_items->num_items = 0;
@@ -1111,7 +1114,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			PageSetAllVisible(page);
 			MarkBufferDirty(buf);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, prunestate.visibility_cutoff_xid,
+							  vacskip.vmbuffer, prunestate.visibility_cutoff_xid,
 							  flags);
 		}
 
@@ -1122,11 +1125,12 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * with buffer lock before concluding that the VM is corrupt.
 		 */
 		else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-				 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+				 visibilitymap_get_status(vacrel->rel,
+										  blkno, &vacskip.vmbuffer) != 0)
 		{
 			elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 				 vacrel->relname, blkno);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacskip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1151,7 +1155,7 @@ lazy_scan_heap(LVRelState *vacrel)
 				 vacrel->relname, blkno);
 			PageClearAllVisible(page);
 			MarkBufferDirty(buf);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacskip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1162,7 +1166,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		else if (all_visible_according_to_vm && prunestate.all_visible &&
 				 prunestate.all_frozen &&
-				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vacskip.vmbuffer))
 		{
 			/*
 			 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1184,7 +1188,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			 */
 			Assert(!TransactionIdIsValid(prunestate.visibility_cutoff_xid));
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacskip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE |
 							  VISIBILITYMAP_ALL_FROZEN);
 		}
@@ -1226,8 +1230,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacskip.vmbuffer))
+	{
+		ReleaseBuffer(vacskip.vmbuffer);
+		vacskip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1316,7 +1323,7 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static BlockNumber
 lazy_scan_skip(VacSkipState *vacskip,
-			   BlockNumber next_block, Buffer *vmbuffer,
+			   BlockNumber next_block,
 			   bool *all_visible_according_to_vm)
 {
 	bool		skipsallvis = false;
@@ -1331,7 +1338,7 @@ lazy_scan_skip(VacSkipState *vacskip,
 		{
 			uint8		mapbits = visibilitymap_get_status(vacskip->vacrel->rel,
 														   vacskip->next_unskippable_block,
-														   vmbuffer);
+														   &vacskip->vmbuffer);
 
 			vacskip->next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-- 
2.37.2

v1-0007-Remove-unneeded-vacuum_delay_point-from-lazy_scan.patch
From b1fe24867172da232456b7e452178bf8adbf0538 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v1 7/7] Remove unneeded vacuum_delay_point from lazy_scan_skip

lazy_scan_skip() does relatively little work, so there is no need to
call vacuum_delay_point(). A future commit will call lazy_scan_skip()
from a callback, and we would like to avoid calling vacuum_delay_point()
in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 49d16fcf039..fc32610397b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1389,8 +1389,6 @@ lazy_scan_skip(VacSkipState *vacskip,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		/*
-- 
2.37.2

#2 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#1)
6 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Dec 31, 2023 at 1:28 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

There are a few comments that still need to be updated. I also noticed I
needed to reorder and combine a couple of the commits. I wanted to
register this for the january commitfest, so I didn't quite have time
for the finishing touches.

I've updated this patch set to remove a commit that didn't make sense
on its own and do various other cleanup.

- Melanie

Attachments:

v2-0004-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch
From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and eventually
AIO), refactor vacuum's logic for skipping blocks such that it is entirely
confined to lazy_scan_skip(). This turns lazy_scan_skip() and the VacSkipState
it uses into an iterator which yields blocks to lazy_scan_heap(). Such a
structure is conducive to an async interface.

By always calling lazy_scan_skip() -- instead of only once we have reached the
next unskippable block -- we no longer need the skipping_current_range
variable. lazy_scan_heap() no longer needs to manage the skipped range by
checking whether it has reached the end before calling lazy_scan_skip() again.
And lazy_scan_skip() can derive the visibility status of a block from whether
or not we are in a skippable range -- that is, from whether or not next_block
is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 233 ++++++++++++++-------------
 1 file changed, 120 insertions(+), 113 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index e3827a5e4d3..42da4ac64f8 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -250,14 +250,13 @@ typedef struct VacSkipState
 	Buffer		vmbuffer;
 	/* Next unskippable block's visibility status */
 	bool		next_unskippable_allvis;
-	/* Whether or not skippable blocks should be skipped */
-	bool		skipping_current_range;
 } VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
-						   BlockNumber next_block);
+static BlockNumber lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+								  BlockNumber blkno,
+								  bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -838,9 +837,15 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	VacSkipState vacskip = {.vmbuffer = InvalidBuffer};
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
+	VacSkipState vacskip = {
+		.next_unskippable_block = InvalidBlockNumber,
+		.vmbuffer = InvalidBuffer
+	};
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -855,37 +860,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, &vacskip, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
-		if (blkno == vacskip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacskip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, &vacskip, blkno + 1);
-
-			Assert(vacskip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacskip.skipping_current_range)
-				continue;
+		blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+							   &all_visible_according_to_vm);
 
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
+		if (blkno == InvalidBlockNumber)
+			break;
 
 		vacrel->scanned_pages++;
 
@@ -1287,20 +1272,13 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in vacskip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	lazy_scan_skip() -- get next block for vacuum to process
  *
- * vacskip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needed a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * lazy_scan_skip() returns the next block that needs to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1310,14 +1288,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in vacskip->next_unskippable_block and next_unskippable_allvis.
- * vacskip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacskip->next_unskippable_block and next_unskippable_allvis. The caller
+ * should not concern itself with anything in vacskip. This is only used by
+ * lazy_scan_skip() to keep track of this state across invocations.
+ *
+ * lazy_scan_skip() returns the next block for vacuum to process and sets its
+ * visibility status in the output parameter, all_visible_according_to_vm.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * vacskip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1327,87 +1317,104 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
+static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
-			   BlockNumber next_block)
+			   BlockNumber next_block,
+			   bool *all_visible_according_to_vm)
 {
 	bool		skipsallvis = false;
 
-	vacskip->next_unskippable_block = next_block;
-	vacskip->next_unskippable_allvis = true;
-	while (vacskip->next_unskippable_block < vacrel->rel_pages)
-	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   vacskip->next_unskippable_block,
-													   &vacskip->vmbuffer);
+	if (next_block >= vacrel->rel_pages)
+		return InvalidBlockNumber;
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacskip->next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacskip->next_unskippable_block)
+	{
+		while (++vacskip->next_unskippable_block < vacrel->rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacskip->next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   vacskip->next_unskippable_block,
+														   &vacskip->vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
-			break;
+			vacskip->next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacskip->next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacskip->next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacskip->next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		vacskip->next_unskippable_block++;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacskip->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacskip->next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
 	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacskip->next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacskip->skipping_current_range = false;
+	if (next_block == vacskip->next_unskippable_block)
+		*all_visible_according_to_vm = vacskip->next_unskippable_allvis;
 	else
-	{
-		vacskip->skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	return next_block;
 }
 
 /*
-- 
2.37.2

v2-0003-Add-lazy_scan_skip-unskippable-state.patch
From eea3c207eeaf7c390096dcb48fc062d81d6d7cc3 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits clearer, first
introduce the struct VacSkipState, which will maintain the variables
needed to decide whether to skip ranges of at least SKIP_PAGES_THRESHOLD
blocks.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 145 ++++++++++++++++-----------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 87 insertions(+), 59 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b28ea2cdb5..e3827a5e4d3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -238,13 +238,26 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * Parameters maintained by lazy_scan_skip() to manage skipping ranges of at
+ * least SKIP_PAGES_THRESHOLD pages.
+ */
+typedef struct VacSkipState
+{
+	/* Next unskippable block */
+	BlockNumber next_unskippable_block;
+	/* Buffer containing next unskippable block's visibility info */
+	Buffer		vmbuffer;
+	/* Next unskippable block's visibility status */
+	bool		next_unskippable_allvis;
+	/* Whether or not skippable blocks should be skipped */
+	bool		skipping_current_range;
+} VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+						   BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -826,12 +839,9 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
+	VacSkipState vacskip = {.vmbuffer = InvalidBuffer};
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -846,9 +856,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, &vacskip, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -856,26 +864,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacskip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacskip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, &vacskip, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacskip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacskip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -918,10 +923,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacskip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacskip.vmbuffer);
+				vacskip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -946,7 +951,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacskip.vmbuffer);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -966,7 +971,7 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/* Check for new or empty pages before lazy_scan_noprune call */
 			if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, true,
-									   vmbuffer))
+									   vacskip.vmbuffer))
 			{
 				/* Processed as new/empty page (lock and pin released) */
 				continue;
@@ -1004,7 +1009,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		}
 
 		/* Check for new or empty pages before lazy_scan_prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false, vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false,
+								   vacskip.vmbuffer))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -1041,7 +1047,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			{
 				Size		freespace;
 
-				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vmbuffer);
+				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vacskip.vmbuffer);
 
 				/* Forget the LP_DEAD items that we just vacuumed */
 				dead_items->num_items = 0;
@@ -1120,7 +1126,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			PageSetAllVisible(page);
 			MarkBufferDirty(buf);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, prunestate.visibility_cutoff_xid,
+							  vacskip.vmbuffer, prunestate.visibility_cutoff_xid,
 							  flags);
 		}
 
@@ -1131,11 +1137,12 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * with buffer lock before concluding that the VM is corrupt.
 		 */
 		else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-				 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+				 visibilitymap_get_status(vacrel->rel, blkno,
+										  &vacskip.vmbuffer) != 0)
 		{
 			elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 				 vacrel->relname, blkno);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacskip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1160,7 +1167,7 @@ lazy_scan_heap(LVRelState *vacrel)
 				 vacrel->relname, blkno);
 			PageClearAllVisible(page);
 			MarkBufferDirty(buf);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacskip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1171,7 +1178,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		else if (all_visible_according_to_vm && prunestate.all_visible &&
 				 prunestate.all_frozen &&
-				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vacskip.vmbuffer))
 		{
 			/*
 			 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1193,7 +1200,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			 */
 			Assert(!TransactionIdIsValid(prunestate.visibility_cutoff_xid));
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacskip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE |
 							  VISIBILITYMAP_ALL_FROZEN);
 		}
@@ -1235,8 +1242,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacskip.vmbuffer))
+	{
+		ReleaseBuffer(vacskip.vmbuffer);
+		vacskip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1280,15 +1290,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in vacskip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * vacskip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in vacskip->next_unskippable_block and next_unskippable_allvis.
+ * vacskip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1298,24 +1327,24 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+			   BlockNumber next_block)
 {
-	BlockNumber next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < vacrel->rel_pages)
+	vacskip->next_unskippable_block = next_block;
+	vacskip->next_unskippable_allvis = true;
+	while (vacskip->next_unskippable_block < vacrel->rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   vmbuffer);
+													   vacskip->next_unskippable_block,
+													   &vacskip->vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacskip->next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1329,14 +1358,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		 *
 		 * Implement this by always treating the last block as unsafe to skip.
 		 */
-		if (next_unskippable_block == vacrel->rel_pages - 1)
+		if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
 			break;
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacskip->next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1358,7 +1387,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		}
 
 		vacuum_delay_point();
-		next_unskippable_block++;
+		vacskip->next_unskippable_block++;
 	}
 
 	/*
@@ -1371,16 +1400,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacskip->next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacskip->skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacskip->skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e37ef9aa76d..bd008e1699b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2955,6 +2955,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacSkipState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.37.2
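The skip logic described in the rewritten comment above can be illustrated with a small stand-alone sketch. All names here are invented stand-ins except SKIP_PAGES_THRESHOLD (which is 32 in vacuumlazy.c); the sketch models only the all-visible check, the last-block rule, and the threshold decision, not the aggressive/DISABLE_PAGE_SKIPPING cases:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned BlockNumber;

#define SKIP_PAGES_THRESHOLD 32

/*
 * Find the next block that is not all-visible, treating the last block
 * as always unskippable (it may need to set nonempty_pages), and decide
 * whether the run of all-visible blocks in front of it is long enough
 * to be worth skipping.
 */
static BlockNumber
find_next_unskippable(const bool *all_visible, BlockNumber rel_pages,
					  BlockNumber next_block, bool *skipping_range)
{
	BlockNumber next_unskippable = next_block;

	while (next_unskippable < rel_pages)
	{
		if (!all_visible[next_unskippable])
			break;
		/* Last block is always scanned. */
		if (next_unskippable == rel_pages - 1)
			break;
		next_unskippable++;
	}

	/* Very small all-visible ranges are not worth skipping. */
	*skipping_range =
		(next_unskippable - next_block) >= SKIP_PAGES_THRESHOLD;
	return next_unskippable;
}
```

With a 40-block relation where only block 35 is not all-visible, a scan starting at block 0 finds a 35-block skippable run (above the threshold, so it is skipped), while the 3-block run after block 35 is below the threshold and gets scanned anyway.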

v2-0001-lazy_scan_skip-remove-unnecessary-local-var-rel_p.patchtext/x-patch; charset=US-ASCII; name=v2-0001-lazy_scan_skip-remove-unnecessary-local-var-rel_p.patchDownload
From 9cd579d6a20aef2aeeab6ef50d72e779d75bf7cd Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:18:40 -0500
Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages

lazy_scan_skip() only uses vacrel->rel_pages twice, so there seems to be
no reason to save it in a local variable, rel_pages.
---
 src/backend/access/heap/vacuumlazy.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b9299b8924..c4e0c077694 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,13 +1302,12 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
+	BlockNumber next_unskippable_block = next_block,
 				nskippable_blocks = 0;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	while (next_unskippable_block < vacrel->rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
@@ -1331,7 +1330,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		 *
 		 * Implement this by always treating the last block as unsafe to skip.
 		 */
-		if (next_unskippable_block == rel_pages - 1)
+		if (next_unskippable_block == vacrel->rel_pages - 1)
 			break;
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-- 
2.37.2

v2-0002-lazy_scan_skip-remove-unneeded-local-var-nskippab.patchtext/x-patch; charset=US-ASCII; name=v2-0002-lazy_scan_skip-remove-unneeded-local-var-nskippab.patchDownload
From 314dd9038593610583e4fe60ab62e0d46ea3be86 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4e0c077694..3b28ea2cdb5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,8 +1302,7 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+	BlockNumber next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1360,7 +1359,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1373,7 +1371,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

v2-0005-VacSkipState-saves-reference-to-LVRelState.patchtext/x-patch; charset=US-ASCII; name=v2-0005-VacSkipState-saves-reference-to-LVRelState.patchDownload
From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 09:47:18 -0500
Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState

The streaming read interface can only give pgsr_next callbacks access to
two pieces of private data. As such, move a reference to the LVRelState
into the VacSkipState.

This is a separate commit (as opposed to as part of the commit
introducing VacSkipState) because it is required for using the streaming
read interface but not a natural change on its own. VacSkipState is per
block and the LVRelState is referenced for the whole relation vacuum.
---
 src/backend/access/heap/vacuumlazy.c | 35 +++++++++++++++-------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 42da4ac64f8..1b64b9988de 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -250,11 +250,13 @@ typedef struct VacSkipState
 	Buffer		vmbuffer;
 	/* Next unskippable block's visibility status */
 	bool		next_unskippable_allvis;
+	/* reference to whole relation vac state */
+	LVRelState *vacrel;
 } VacSkipState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
+static BlockNumber lazy_scan_skip(VacSkipState *vacskip,
 								  BlockNumber blkno,
 								  bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
@@ -844,7 +846,8 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber blkno = InvalidBlockNumber;
 	VacSkipState vacskip = {
 		.next_unskippable_block = InvalidBlockNumber,
-		.vmbuffer = InvalidBuffer
+		.vmbuffer = InvalidBuffer,
+		.vacrel = vacrel
 	};
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
@@ -866,7 +869,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		Page		page;
 		LVPagePruneState prunestate;
 
-		blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+		blkno = lazy_scan_skip(&vacskip, blkno + 1,
 							   &all_visible_according_to_vm);
 
 		if (blkno == InvalidBlockNumber)
@@ -1299,9 +1302,10 @@ lazy_scan_heap(LVRelState *vacrel)
  * lazy_scan_skip() returns the next block for vacuum to process and sets its
  * visibility status in the output parameter, all_visible_according_to_vm.
  *
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ * vacskip->vacrel is an in/out parameter here; vacuum options and information
+ * about the relation are read and vacrel->skippedallvis is set to ensure we
+ * don't advance relfrozenxid when we have skipped vacuuming all visible
+ * blocks.
  *
  * vacskip->vmbuffer will contain the block from the VM containing visibility
 * information for the next unskippable heap block. We may end up needing a
@@ -1318,21 +1322,20 @@ lazy_scan_heap(LVRelState *vacrel)
  * choice to skip such a range is actually made, making everything safe.)
  */
 static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
-			   BlockNumber next_block,
-			   bool *all_visible_according_to_vm)
+lazy_scan_skip(VacSkipState *vacskip,
+			   BlockNumber next_block, bool *all_visible_according_to_vm)
 {
 	bool		skipsallvis = false;
 
-	if (next_block >= vacrel->rel_pages)
+	if (next_block >= vacskip->vacrel->rel_pages)
 		return InvalidBlockNumber;
 
 	if (vacskip->next_unskippable_block == InvalidBlockNumber ||
 		next_block > vacskip->next_unskippable_block)
 	{
-		while (++vacskip->next_unskippable_block < vacrel->rel_pages)
+		while (++vacskip->next_unskippable_block < vacskip->vacrel->rel_pages)
 		{
-			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+			uint8		mapbits = visibilitymap_get_status(vacskip->vacrel->rel,
 														   vacskip->next_unskippable_block,
 														   &vacskip->vmbuffer);
 
@@ -1356,11 +1359,11 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 			 * Implement this by always treating the last block as unsafe to
 			 * skip.
 			 */
-			if (vacskip->next_unskippable_block == vacrel->rel_pages - 1)
+			if (vacskip->next_unskippable_block == vacskip->vacrel->rel_pages - 1)
 				break;
 
 			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-			if (!vacrel->skipwithvm)
+			if (!vacskip->vacrel->skipwithvm)
 			{
 				/* Caller shouldn't rely on all_visible_according_to_vm */
 				vacskip->next_unskippable_allvis = false;
@@ -1375,7 +1378,7 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 			 */
 			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
 			{
-				if (vacrel->aggressive)
+				if (vacskip->vacrel->aggressive)
 					break;
 
 				/*
@@ -1405,7 +1408,7 @@ lazy_scan_skip(LVRelState *vacrel, VacSkipState *vacskip,
 		{
 			next_block = vacskip->next_unskippable_block;
 			if (skipsallvis)
-				vacrel->skippedallvis = true;
+				vacskip->vacrel->skippedallvis = true;
 		}
 	}
 
-- 
2.37.2

v2-0006-Remove-unneeded-vacuum_delay_point-from-lazy_scan.patchtext/x-patch; charset=US-ASCII; name=v2-0006-Remove-unneeded-vacuum_delay_point-from-lazy_scan.patchDownload
From aa948b99c09f2fdf9a10deac78bcb880e09366ec Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v2 6/6] Remove unneeded vacuum_delay_point from lazy_scan_skip

lazy_scan_skip() does relatively little work, so there is no need to
call vacuum_delay_point(). A future commit will call lazy_scan_skip()
from a callback, and we would like to avoid calling vacuum_delay_point()
in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1b64b9988de..1329da95254 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1388,8 +1388,6 @@ lazy_scan_skip(VacSkipState *vacskip,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		/*
-- 
2.37.2

#3Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#2)
Re: Confine vacuum skip logic to lazy_scan_skip

Hi,

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.
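A stand-alone sketch of that reload effect (toy types, not the patch's code). In the first form the loop bound is a struct field reachable through the same pointer the callee receives, so the compiler must reload it each iteration; in the second it is a local the callee provably cannot touch:

```c
#include <assert.h>

struct RelState
{
	unsigned	rel_pages;
};

/*
 * Pretend this lives in another translation unit: the optimizer cannot
 * see that it leaves state->rel_pages unchanged.
 */
void
process_block(struct RelState *state, unsigned blkno)
{
	(void) state;
	(void) blkno;
}

/*
 * Field access in the loop condition: rel_pages must be reloaded from
 * memory on every iteration, since process_block() gets the same
 * pointer and might have modified it.
 */
unsigned
scan_field(struct RelState *state)
{
	unsigned	count = 0;

	for (unsigned blkno = 0; blkno < state->rel_pages; blkno++)
	{
		process_block(state, blkno);
		count++;
	}
	return count;
}

/*
 * Local copy: the bound is loop-invariant and can stay in a register
 * for the whole loop (the form patches 0001/0002 remove).
 */
unsigned
scan_local(struct RelState *state)
{
	unsigned	rel_pages = state->rel_pages;
	unsigned	count = 0;

	for (unsigned blkno = 0; blkno < rel_pages; blkno++)
	{
		process_block(state, blkno);
		count++;
	}
	return count;
}
```

Both functions behave identically; the difference only shows up in the generated code.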

Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce the struct, VacSkipState, which will maintain the variables
needed to skip ranges less than SKIP_PAGES_THRESHOLD.

Why not add this to LVRelState, possibly as a struct embedded within it?

From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and eventually
AIO), refactor vacuum's logic for skipping blocks such that it is entirely
confined to lazy_scan_skip(). This turns lazy_scan_skip() and the VacSkipState
it uses into an iterator which yields blocks to lazy_scan_heap(). Such a
structure is conducive to an async interface.

And it's cleaner - I find the current code extremely hard to reason about.

By always calling lazy_scan_skip() -- instead of only when we have reached the
next unskippable block, we no longer need the skipping_current_range variable.
lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
can derive the visibility status of a block from whether or not we are in a
skippable range -- that is, whether or not the next_block is equal to the next
unskippable block.

I wonder if it should be renamed as part of this - the name is somewhat
confusing now (and perhaps before)? lazy_scan_get_next_block() or such?

+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;

-		if (blkno == vacskip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacskip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, &vacskip, blkno + 1);
-
-			Assert(vacskip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacskip.skipping_current_range)
-				continue;
+		blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+							   &all_visible_according_to_vm);
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
+		if (blkno == InvalidBlockNumber)
+			break;

vacrel->scanned_pages++;

I don't like that we still do determination about the next block outside of
lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
and lazy_scan_heap().

I'd probably change the interface to something like

while (lazy_scan_get_next_block(vacrel, &blkno))
{
...
}

From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 09:47:18 -0500
Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState

The streaming read interface can only give pgsr_next callbacks access to
two pieces of private data. As such, move a reference to the LVRelState
into the VacSkipState.

This is a separate commit (as opposed to as part of the commit
introducing VacSkipState) because it is required for using the streaming
read interface but not a natural change on its own. VacSkipState is per
block and the LVRelState is referenced for the whole relation vacuum.

I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
or point to it from VacSkipState.

LVRelState is already tied to the iteration state, so I don't think there's a
reason not to do so.

Greetings,

Andres Freund

#4Jim Nasby
jim.nasby@gmail.com
In reply to: Andres Freund (#3)
Re: Confine vacuum skip logic to lazy_scan_skip

On 1/4/24 2:23 PM, Andres Freund wrote:

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

Admittedly I'm not up to speed on recent vacuum changes, but I have to
wonder if the concept of skipping should go away in the context of vector
IO? Instead of thinking about "we can skip this range of blocks", why not
maintain a list of "here's the next X number of blocks that we need to
vacuum"?

--
Jim Nasby, Data Architect, Austin TX

#5Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Jim Nasby (#4)
Re: Confine vacuum skip logic to lazy_scan_skip

Hi,

On Fri, 5 Jan 2024 at 02:25, Jim Nasby <jim.nasby@gmail.com> wrote:

On 1/4/24 2:23 PM, Andres Freund wrote:

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

Admittedly I'm not up to speed on recent vacuum changes, but I have to wonder if the concept of skipping should go away in the context of vector IO? Instead of thinking about "we can skip this range of blocks", why not maintain a list of "here's the next X number of blocks that we need to vacuum"?

Sorry if I misunderstood. AFAIU, with the help of vectored IO, "the
next X number of blocks that need to be vacuumed" will be prefetched by
calculating the unskippable blocks (using the lazy_scan_skip()
function), and X will be determined by Postgres itself. Do you have
something different in mind?

--
Regards,
Nazir Bilal Yavuz
Microsoft

#6Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#3)
4 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

v3 attached

On Thu, Jan 4, 2024 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

I buy that for 0001 but 0002 is still using local variables.
nskippable_blocks was just another variable to keep track of even
though we could already get that info from local variables
next_unskippable_block and next_block.

In light of this comment, I've refactored 0003/0004 (0002 and 0003 in
this version [v3]) to use local variables in the loop as well. I had
started using the members of the VacSkipState which I introduced.

Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce the struct, VacSkipState, which will maintain the variables
needed to skip ranges less than SKIP_PAGES_THRESHOLD.

Why not add this to LVRelState, possibly as a struct embedded within it?

Done in attached.

From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip

By always calling lazy_scan_skip() -- instead of only when we have reached the
next unskippable block, we no longer need the skipping_current_range variable.
lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
can derive the visibility status of a block from whether or not we are in a
skippable range -- that is, whether or not the next_block is equal to the next
unskippable block.

I wonder if it should be renamed as part of this - the name is somewhat
confusing now (and perhaps before)? lazy_scan_get_next_block() or such?

Why stop there! I've removed lazy and called it
heap_vac_scan_get_next_block() -- a little long, but...

+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
-             if (blkno == vacskip.next_unskippable_block)
-             {
-                     /*
-                      * Can't skip this page safely.  Must scan the page.  But
-                      * determine the next skippable range after the page first.
-                      */
-                     all_visible_according_to_vm = vacskip.next_unskippable_allvis;
-                     lazy_scan_skip(vacrel, &vacskip, blkno + 1);
-
-                     Assert(vacskip.next_unskippable_block >= blkno + 1);
-             }
-             else
-             {
-                     /* Last page always scanned (may need to set nonempty_pages) */
-                     Assert(blkno < rel_pages - 1);
-
-                     if (vacskip.skipping_current_range)
-                             continue;
+             blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+                                                        &all_visible_according_to_vm);
-                     /* Current range is too small to skip -- just scan the page */
-                     all_visible_according_to_vm = true;
-             }
+             if (blkno == InvalidBlockNumber)
+                     break;

vacrel->scanned_pages++;

I don't like that we still do determination about the next block outside of
lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
and lazy_scan_heap().

I'd probably change the interface to something like

while (lazy_scan_get_next_block(vacrel, &blkno))
{
...
}

I've done this. I do now find the parameter names a bit confusing.
There is next_block (which is the "next block in line" and is an input
parameter) and blkno, which is an output parameter with the next block
that should actually be processed. Maybe it's okay?
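As a toy illustration of that iterator shape (all names and types here are invented stand-ins, not the patch's code; it ignores the visibility-status output, the last-block rule, and SKIP_PAGES_THRESHOLD):

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned BlockNumber;

#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Toy stand-in for LVRelState: a block count and a per-block skip flag. */
typedef struct
{
	BlockNumber rel_pages;
	const bool *skippable;
} ToyRelState;

/*
 * Iterator in the style discussed above: given the next block in line,
 * set *blkno to the next block the caller must process, returning false
 * when the scan is done.
 */
static bool
toy_get_next_block(ToyRelState *state, BlockNumber next_block,
				   BlockNumber *blkno)
{
	/* Advance past the skippable range, if any. */
	while (next_block < state->rel_pages && state->skippable[next_block])
		next_block++;

	if (next_block >= state->rel_pages)
	{
		*blkno = InvalidBlockNumber;
		return false;
	}

	*blkno = next_block;
	return true;
}
```

The caller loop then collapses to `while (toy_get_next_block(state, blkno + 1, &blkno))`; seeding blkno with InvalidBlockNumber makes `blkno + 1` wrap to 0 for the first call, mirroring how lazy_scan_heap() seeds the loop in the patch.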

From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 09:47:18 -0500
Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState

The streaming read interface can only give pgsr_next callbacks access to
two pieces of private data. As such, move a reference to the LVRelState
into the VacSkipState.

This is a separate commit (as opposed to as part of the commit
introducing VacSkipState) because it is required for using the streaming
read interface but not a natural change on its own. VacSkipState is per
block and the LVRelState is referenced for the whole relation vacuum.

I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
or point to it from VacSkipState.

LVRelState is already tied to the iteration state, so I don't think there's a
reason not to do so.

Done, and, as such, this patch is dropped from the set.

- Melanie

Attachments:

v3-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchtext/x-patch; charset=US-ASCII; name=v3-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchDownload
From 2ac56028b7379a5f40680801eb0e188ca512bd7f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v3 4/4] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 77d35f1974d..a6cb96409b9 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1384,8 +1384,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->skip.next_unskippable_block = next_unskippable_block;
-- 
2.37.2

v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchtext/x-patch; charset=US-ASCII; name=v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchDownload
From 57b18a83ea998dd27ecaac9c04e503089813cf6c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v3 2/4] Add lazy_scan_skip unskippable state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce add a struct to LVRelState containing variables needed to skip
ranges less than SKIP_PAGES_THRESHOLD.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 139 ++++++++++++++++-----------
 1 file changed, 84 insertions(+), 55 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 22ba16b6718..cab2bbb9629 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -210,6 +210,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Buffer containing next unskippable block's visibility info */
+		Buffer		vmbuffer;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			skip;
 } LVRelState;
 
 /*
@@ -237,13 +253,9 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -825,12 +837,8 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -844,10 +852,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.vmbuffer = InvalidBuffer;
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -855,26 +862,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		all_visible_according_to_vm;
 		LVPagePruneState prunestate;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->skip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->skip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -917,10 +921,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacrel->skip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacrel->skip.vmbuffer);
+				vacrel->skip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -945,7 +949,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -964,7 +968,7 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/* Check for new or empty pages before lazy_scan_noprune call */
 			if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, true,
-									   vmbuffer))
+									   vacrel->skip.vmbuffer))
 			{
 				/* Processed as new/empty page (lock and pin released) */
 				continue;
@@ -1003,7 +1007,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		}
 
 		/* Check for new or empty pages before lazy_scan_prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false, vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, false,
+								   vacrel->skip.vmbuffer))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -1037,7 +1042,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			{
 				Size		freespace;
 
-				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vmbuffer);
+				lazy_vacuum_heap_page(vacrel, blkno, buf, 0, vacrel->skip.vmbuffer);
 
 				/* Forget the LP_DEAD items that we just vacuumed */
 				dead_items->num_items = 0;
@@ -1116,7 +1121,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			PageSetAllVisible(page);
 			MarkBufferDirty(buf);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, prunestate.visibility_cutoff_xid,
+							  vacrel->skip.vmbuffer, prunestate.visibility_cutoff_xid,
 							  flags);
 		}
 
@@ -1127,11 +1132,12 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * with buffer lock before concluding that the VM is corrupt.
 		 */
 		else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-				 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+				 visibilitymap_get_status(vacrel->rel, blkno,
+										  &vacrel->skip.vmbuffer) != 0)
 		{
 			elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 				 vacrel->relname, blkno);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1156,7 +1162,7 @@ lazy_scan_heap(LVRelState *vacrel)
 				 vacrel->relname, blkno);
 			PageClearAllVisible(page);
 			MarkBufferDirty(buf);
-			visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+			visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 								VISIBILITYMAP_VALID_BITS);
 		}
 
@@ -1167,7 +1173,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		else if (all_visible_according_to_vm && prunestate.all_visible &&
 				 prunestate.all_frozen &&
-				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+				 !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
 		{
 			/*
 			 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1189,7 +1195,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			 */
 			Assert(!TransactionIdIsValid(prunestate.visibility_cutoff_xid));
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacrel->skip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE |
 							  VISIBILITYMAP_ALL_FROZEN);
 		}
@@ -1231,8 +1237,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacrel->skip.vmbuffer))
+	{
+		ReleaseBuffer(vacrel->skip.vmbuffer);
+		vacrel->skip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1276,15 +1285,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in skip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needed a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
+ *
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * The block number and visibility status of the next unskippable block are set
+ * in skip->next_unskippable_block and next_unskippable_allvis.
+ * skip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1294,25 +1322,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
 {
+	/* Use local variables for better optimized loop code */
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->skip.next_unskippable_allvis = true;
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &vacrel->skip.vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1333,7 +1362,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1358,6 +1387,8 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	vacrel->skip.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1368,16 +1399,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->skip.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->skip.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
-- 
2.37.2

Attachment: v3-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch (text/x-patch)
From 5d4f6b74e35491be52a5cc32e4e56f8584ce127e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v3 3/4] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and the skip state in LVRelState it uses into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_get_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_get_next_block(), instead of only when
we have reached the next unskippable block, we no longer need the
skipping_current_range variable. lazy_scan_heap() also no longer needs
to manage the skipped range itself by checking whether it has reached
the end before calling heap_vac_scan_get_next_block() again. And
heap_vac_scan_get_next_block() can derive the visibility status of a
block from whether or not we are in a skippable range, that is, whether
next_block is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 258 ++++++++++++++-------------
 1 file changed, 134 insertions(+), 124 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index cab2bbb9629..77d35f1974d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -212,8 +212,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
+	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -255,7 +255,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+										 BlockNumber *blkno,
+										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -835,9 +837,15 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
+	Page		page;
+	LVPagePruneState prunestate;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -852,39 +860,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
-	{
-		Buffer		buf;
-		Page		page;
-		bool		all_visible_according_to_vm;
-		LVPagePruneState prunestate;
-
-		if (blkno == vacrel->skip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, blkno + 1);
-
-			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->skip.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
 
+	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
+										&blkno, &all_visible_according_to_vm))
+	{
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1093,7 +1074,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 		/*
 		 * Handle setting visibility map bit based on information from the VM
-		 * (as of last lazy_scan_skip() call), and from prunestate
+		 * (as of last heap_vac_scan_get_next_block() call), and from
+		 * prunestate
 		 */
 		if (!all_visible_according_to_vm && prunestate.all_visible)
 		{
@@ -1128,8 +1110,9 @@ lazy_scan_heap(LVRelState *vacrel)
 		/*
 		 * As of PostgreSQL 9.2, the visibility map bit should never be set if
 		 * the page-level bit is clear.  However, it's possible that the bit
-		 * got cleared after lazy_scan_skip() was called, so we must recheck
-		 * with buffer lock before concluding that the VM is corrupt.
+		 * got cleared after heap_vac_scan_get_next_block() was called, so we
+		 * must recheck with buffer lock before concluding that the VM is
+		 * corrupt.
 		 */
 		else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
 				 visibilitymap_get_status(vacrel->rel, blkno,
@@ -1282,20 +1265,14 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in skip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
- *
- * skip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needed a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * heap_vac_scan_get_next_block() sets blkno to the next block that actually needs
+ * to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1305,14 +1282,25 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in skip->next_unskippable_block and next_unskippable_allvis.
- * skip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacrel->skip.next_unskippable_block and next_unskippable_allvis.
+ *
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
+ * returns false if there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1322,91 +1310,113 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
+static bool
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->skip.next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   &vacrel->skip.vmbuffer);
+		*blkno = InvalidBlockNumber;
+		return false;
+	}
+
+	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->skip.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &vacrel->skip.vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->skip.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->skip.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+		vacrel->skip.next_unskippable_block = next_unskippable_block;
 
-	vacrel->skip.next_unskippable_block = next_unskippable_block;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->skip.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->skip.skipping_current_range = false;
+	if (next_block == vacrel->skip.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
 	else
-	{
-		vacrel->skip.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	*blkno = next_block;
+	return true;
 }
 
 /*
-- 
2.37.2

Attachment: v3-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch (text/x-patch)
From e0bc1eb3b5c1a2dfda0f54eae98278ad21a05be8 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v3 1/4] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b63cad1335f..22ba16b6718 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1299,8 +1299,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1357,7 +1356,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1370,7 +1368,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

#7Melanie Plageman
melanieplageman@gmail.com
In reply to: Nazir Bilal Yavuz (#5)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Jan 5, 2024 at 5:51 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

On Fri, 5 Jan 2024 at 02:25, Jim Nasby <jim.nasby@gmail.com> wrote:

On 1/4/24 2:23 PM, Andres Freund wrote:

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

Admittedly I'm not up to speed on recent vacuum changes, but I have to wonder if the concept of skipping should go away in the context of vector IO? Instead of thinking about "we can skip this range of blocks", why not maintain a list of "here's the next X number of blocks that we need to vacuum"?

Sorry if I misunderstood. AFAIU, with the help of the vectored IO;
"the next X number of blocks that need to be vacuumed" will be
prefetched by calculating the unskippable blocks ( using the
lazy_scan_skip() function ) and the X will be determined by Postgres
itself. Do you have something different in your mind?

I think you are both right. As we gain more control of readahead from
within Postgres, we will likely want to revisit this heuristic as it
may not serve us anymore. But the streaming read interface/vectored
I/O is also not a drop-in replacement for it. To change anything and
ensure there is no regression, we will probably have to do
cross-platform benchmarking, though.

That being said, I would absolutely love to get rid of the skippable
ranges because I find them very error-prone and confusing. Hopefully
now that the skipping logic is isolated to a single function, it will
be easier not to trip over it when working on lazy_scan_heap().

- Melanie

#8Jim Nasby
jim.nasby@gmail.com
In reply to: Melanie Plageman (#7)
Re: Confine vacuum skip logic to lazy_scan_skip

On 1/11/24 5:50 PM, Melanie Plageman wrote:

On Fri, Jan 5, 2024 at 5:51 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

On Fri, 5 Jan 2024 at 02:25, Jim Nasby <jim.nasby@gmail.com> wrote:

On 1/4/24 2:23 PM, Andres Freund wrote:

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

Admittedly I'm not up to speed on recent vacuum changes, but I have to wonder if the concept of skipping should go away in the context of vector IO? Instead of thinking about "we can skip this range of blocks", why not maintain a list of "here's the next X number of blocks that we need to vacuum"?

Sorry if I misunderstood. AFAIU, with the help of the vectored IO;
"the next X number of blocks that need to be vacuumed" will be
prefetched by calculating the unskippable blocks ( using the
lazy_scan_skip() function ) and the X will be determined by Postgres
itself. Do you have something different in your mind?

I think you are both right. As we gain more control of readahead from
within Postgres, we will likely want to revisit this heuristic as it
may not serve us anymore. But the streaming read interface/vectored
I/O is also not a drop-in replacement for it. To change anything and
ensure there is no regression, we will probably have to do
cross-platform benchmarking, though.

That being said, I would absolutely love to get rid of the skippable
ranges because I find them very error-prone and confusing. Hopefully
now that the skipping logic is isolated to a single function, it will
be easier not to trip over it when working on lazy_scan_heap().

Yeah, arguably it's just a matter of semantics, but IMO it's a lot
clearer to simply think in terms of "here's the next blocks we know we
want to vacuum" instead of "we vacuum everything, but sometimes we skip
some blocks".
--
Jim Nasby, Data Architect, Austin TX

#9Melanie Plageman
melanieplageman@gmail.com
In reply to: Jim Nasby (#8)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Jan 12, 2024 at 2:02 PM Jim Nasby <jim.nasby@gmail.com> wrote:

On 1/11/24 5:50 PM, Melanie Plageman wrote:

On Fri, Jan 5, 2024 at 5:51 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

On Fri, 5 Jan 2024 at 02:25, Jim Nasby <jim.nasby@gmail.com> wrote:

On 1/4/24 2:23 PM, Andres Freund wrote:

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

Admittedly I'm not up to speed on recent vacuum changes, but I have to wonder if the concept of skipping should go away in the context of vector IO? Instead of thinking about "we can skip this range of blocks", why not maintain a list of "here's the next X number of blocks that we need to vacuum"?

Sorry if I misunderstood. AFAIU, with the help of the vectored IO;
"the next X number of blocks that need to be vacuumed" will be
prefetched by calculating the unskippable blocks ( using the
lazy_scan_skip() function ) and the X will be determined by Postgres
itself. Do you have something different in your mind?

I think you are both right. As we gain more control of readahead from
within Postgres, we will likely want to revisit this heuristic as it
may not serve us anymore. But the streaming read interface/vectored
I/O is also not a drop-in replacement for it. To change anything and
ensure there is no regression, we will probably have to do
cross-platform benchmarking, though.

That being said, I would absolutely love to get rid of the skippable
ranges because I find them very error-prone and confusing. Hopefully
now that the skipping logic is isolated to a single function, it will
be easier not to trip over it when working on lazy_scan_heap().

Yeah, arguably it's just a matter of semantics, but IMO it's a lot
clearer to simply think in terms of "here's the next blocks we know we
want to vacuum" instead of "we vacuum everything, but sometimes we skip
some blocks".

Even "we vacuum some stuff, but sometimes we skip some blocks" would
be okay. What we have now is "we vacuum some stuff, but sometimes we
skip some blocks, but only if we would skip enough blocks, and, when
we decide to do that, we can't go back and actually get visibility
information for those blocks we skipped because we are too cheap."

- Melanie

#10vignesh C
vignesh21@gmail.com
In reply to: Melanie Plageman (#6)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, 12 Jan 2024 at 05:12, Melanie Plageman
<melanieplageman@gmail.com> wrote:

v3 attached

On Thu, Jan 4, 2024 at 3:23 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2024-01-02 12:36:18 -0500, Melanie Plageman wrote:

Subject: [PATCH v2 1/6] lazy_scan_skip remove unnecessary local var rel_pages
Subject: [PATCH v2 2/6] lazy_scan_skip remove unneeded local var
nskippable_blocks

I think these may lead to worse code - the compiler has to reload
vacrel->rel_pages/next_unskippable_block for every loop iteration, because it
can't guarantee that they're not changed within one of the external functions
called in the loop body.

I buy that for 0001 but 0002 is still using local variables.
nskippable_blocks was just another variable to keep track of even
though we could already get that info from local variables
next_unskippable_block and next_block.

In light of this comment, I've refactored 0003/0004 (0002 and 0003 in
this version [v3]) to use local variables in the loop as well. I had
started using the members of the VacSkipState which I introduced.

Subject: [PATCH v2 3/6] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce the struct, VacSkipState, which will maintain the variables
needed to skip ranges less than SKIP_PAGES_THRESHOLD.

Why not add this to LVRelState, possibly as a struct embedded within it?

Done in attached.

From 335faad5948b2bec3b83c2db809bb9161d373dcb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v2 4/6] Confine vacuum skip logic to lazy_scan_skip

By always calling lazy_scan_skip() -- instead of only when we have reached the
next unskippable block, we no longer need the skipping_current_range variable.
lazy_scan_heap() no longer needs to manage the skipped range -- checking if we
reached the end in order to then call lazy_scan_skip(). And lazy_scan_skip()
can derive the visibility status of a block from whether or not we are in a
skippable range -- that is, whether or not the next_block is equal to the next
unskippable block.

I wonder if it should be renamed as part of this - the name is somewhat
confusing now (and perhaps before)? lazy_scan_get_next_block() or such?

Why stop there! I've removed lazy and called it
heap_vac_scan_get_next_block() -- a little long, but...

+     while (true)
{
Buffer          buf;
Page            page;
-             bool            all_visible_according_to_vm;
LVPagePruneState prunestate;
-             if (blkno == vacskip.next_unskippable_block)
-             {
-                     /*
-                      * Can't skip this page safely.  Must scan the page.  But
-                      * determine the next skippable range after the page first.
-                      */
-                     all_visible_according_to_vm = vacskip.next_unskippable_allvis;
-                     lazy_scan_skip(vacrel, &vacskip, blkno + 1);
-
-                     Assert(vacskip.next_unskippable_block >= blkno + 1);
-             }
-             else
-             {
-                     /* Last page always scanned (may need to set nonempty_pages) */
-                     Assert(blkno < rel_pages - 1);
-
-                     if (vacskip.skipping_current_range)
-                             continue;
+             blkno = lazy_scan_skip(vacrel, &vacskip, blkno + 1,
+                                                        &all_visible_according_to_vm);
-                     /* Current range is too small to skip -- just scan the page */
-                     all_visible_according_to_vm = true;
-             }
+             if (blkno == InvalidBlockNumber)
+                     break;

vacrel->scanned_pages++;

I don't like that we still do determination about the next block outside of
lazy_scan_skip() and have duplicated exit conditions between lazy_scan_skip()
and lazy_scan_heap().

I'd probably change the interface to something like

while (lazy_scan_get_next_block(vacrel, &blkno))
{
...
}

I've done this. I do now find the parameter names a bit confusing.
There is next_block (which is the "next block in line" and is an input
parameter) and blkno, which is an output parameter with the next block
that should actually be processed. Maybe it's okay?

From b6603e35147c4bbe3337280222e6243524b0110e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 09:47:18 -0500
Subject: [PATCH v2 5/6] VacSkipState saves reference to LVRelState

The streaming read interface can only give pgsr_next callbacks access to
two pieces of private data. As such, move a reference to the LVRelState
into the VacSkipState.

This is a separate commit (as opposed to as part of the commit
introducing VacSkipState) because it is required for using the streaming
read interface but not a natural change on its own. VacSkipState is per
block and the LVRelState is referenced for the whole relation vacuum.

I'd do it the other way round, i.e. either embed VacSkipState ino LVRelState
or point to it from VacSkipState.

LVRelState is already tied to the iteration state, so I don't think there's a
reason not to do so.

Done, and, as such, this patch is dropped from the set.

CFBot shows that the patch does not apply anymore as in [1]:
=== applying patch
./v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
patching file src/backend/access/heap/vacuumlazy.c
...
Hunk #10 FAILED at 1042.
Hunk #11 FAILED at 1121.
Hunk #12 FAILED at 1132.
Hunk #13 FAILED at 1161.
Hunk #14 FAILED at 1172.
Hunk #15 FAILED at 1194.
...
6 out of 21 hunks FAILED -- saving rejects to file
src/backend/access/heap/vacuumlazy.c.rej

Please post an updated version for the same.

[1]: http://cfbot.cputube.org/patch_46_4755.log

Regards,
Vignesh

#11Melanie Plageman
melanieplageman@gmail.com
In reply to: vignesh C (#10)
4 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Jan 26, 2024 at 8:28 AM vignesh C <vignesh21@gmail.com> wrote:

CFBot shows that the patch does not apply anymore as in [1]:
=== applying patch
./v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
patching file src/backend/access/heap/vacuumlazy.c
...
Hunk #10 FAILED at 1042.
Hunk #11 FAILED at 1121.
Hunk #12 FAILED at 1132.
Hunk #13 FAILED at 1161.
Hunk #14 FAILED at 1172.
Hunk #15 FAILED at 1194.
...
6 out of 21 hunks FAILED -- saving rejects to file
src/backend/access/heap/vacuumlazy.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4755.log

Fixed in attached rebased v4

- Melanie

Attachments:

v4-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchtext/x-patch; charset=US-ASCII; name=v4-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchDownload
From b71fbf1e9abea4689b57d9439ecdcc4387e91195 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v4 4/4] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index e5988262611..ea270941379 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1190,8 +1190,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->skip.next_unskippable_block = next_unskippable_block;
-- 
2.37.2

v4-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patchtext/x-patch; charset=US-ASCII; name=v4-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patchDownload
From 5b39165dde6e60ed214bc988eeb58fb9d357030c Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v4 3/4] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and the skip state in LVRelState it uses into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_get_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_get_next_block() -- instead of only when
we have reached the next unskippable block, we no longer need the
skipping_current_range variable. lazy_scan_heap() no longer needs to
manage the skipped range -- checking if we reached the end in order to
then call heap_vac_scan_get_next_block(). And
heap_vac_scan_get_next_block() can derive the visibility status of a
block from whether or not we are in a skippable range -- that is,
whether or not the next_block is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 243 ++++++++++++++-------------
 1 file changed, 126 insertions(+), 117 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 077164896fb..e5988262611 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -212,8 +212,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
+	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -238,7 +238,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+										 BlockNumber *blkno,
+										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock);
@@ -820,8 +822,11 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -836,40 +841,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+
+	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
+										&blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == vacrel->skip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, blkno + 1);
-
-			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->skip.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1089,20 +1071,14 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in skip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
- * skip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needed a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * heap_vac_scan_get_next_block() sets blkno to next block that actually needs
+ * to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1112,14 +1088,25 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in skip->next_unskippable_block and next_unskippable_allvis.
- * skip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacrel->skip->next_unskippable_block and next_unskippable_allvis.
+ *
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
+ * returns false if there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needed a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1129,91 +1116,113 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
+static bool
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->skip.next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   &vacrel->skip.vmbuffer);
+		*blkno = InvalidBlockNumber;
+		return false;
+	}
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->skip.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
+
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &vacrel->skip.vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->skip.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->skip.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+		vacrel->skip.next_unskippable_block = next_unskippable_block;
 
-	vacrel->skip.next_unskippable_block = next_unskippable_block;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->skip.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->skip.skipping_current_range = false;
+	if (next_block == vacrel->skip.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
 	else
-	{
-		vacrel->skip.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	*blkno = next_block;
+	return true;
 }
 
 /*
-- 
2.37.2

v4-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patchtext/x-patch; charset=US-ASCII; name=v4-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patchDownload
From ab6d37e74daba17ff845f7399ec734f751127156 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v4 1/4] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index fa56480808b..e21c1124f5c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1109,8 +1109,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1167,7 +1166,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1180,7 +1178,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

v4-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchtext/x-patch; charset=US-ASCII; name=v4-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchDownload
From 9e949d6e6f6e6d63b246c70fc88ba1d79ea5eeb6 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v4 2/4] Add lazy_scan_skip unskippable state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce a struct to LVRelState containing variables needed to skip
ranges less than SKIP_PAGES_THRESHOLD.

lazy_scan_prune() and lazy_scan_new_or_empty() can now access the
buffer containing the relevant block of the visibility map through the
LVRelState.skip, so it no longer needs to be a separate function
parameter.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 154 ++++++++++++++++-----------
 1 file changed, 90 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index e21c1124f5c..077164896fb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -210,6 +210,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Buffer containing next unskippable block's visibility info */
+		Buffer		vmbuffer;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			skip;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -220,19 +236,15 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
-								   bool sharelock, Buffer vmbuffer);
+								   bool sharelock);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
-							Buffer vmbuffer, bool all_visible_according_to_vm,
+							bool all_visible_according_to_vm,
 							bool *has_lpdead_items);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -809,12 +821,8 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -828,10 +836,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.vmbuffer = InvalidBuffer;
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -840,26 +847,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->skip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->skip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -902,10 +906,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacrel->skip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacrel->skip.vmbuffer);
+				vacrel->skip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -930,7 +934,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
@@ -948,8 +952,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			LockBuffer(buf, BUFFER_LOCK_SHARE);
 
 		/* Check for new or empty pages before lazy_scan_[no]prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
-								   vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -991,7 +994,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1041,8 +1044,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacrel->skip.vmbuffer))
+	{
+		ReleaseBuffer(vacrel->skip.vmbuffer);
+		vacrel->skip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1086,15 +1092,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in skip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in skip->next_unskippable_block and next_unskippable_allvis.
+ * skip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1104,25 +1129,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
 {
+	/* Use local variables for better optimized loop code */
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->skip.next_unskippable_allvis = true;
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &vacrel->skip.vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1143,7 +1169,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1168,6 +1194,8 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	vacrel->skip.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1178,16 +1206,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->skip.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->skip.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
@@ -1220,7 +1246,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
  */
 static bool
 lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
-					   Page page, bool sharelock, Buffer vmbuffer)
+					   Page page, bool sharelock)
 {
 	Size		freespace;
 
@@ -1306,7 +1332,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacrel->skip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 		}
@@ -1342,10 +1368,11 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * any tuple that becomes dead after the call to heap_page_prune() can't need to
  * be frozen, because it was visible to another session when vacuum started.
  *
- * vmbuffer is the buffer containing the VM block with visibility information
- * for the heap block, blkno. all_visible_according_to_vm is the saved
- * visibility status of the heap block looked up earlier by the caller. We
- * won't rely entirely on this status, as it may be out of date.
+ * vacrel->skip.vmbuffer is the buffer containing the VM block with
+ * visibility information for the heap block, blkno.
+ * all_visible_according_to_vm is the saved visibility status of the heap block
+ * looked up earlier by the caller. We won't rely entirely on this status, as
+ * it may be out of date.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1355,7 +1382,6 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
-				Buffer vmbuffer,
 				bool all_visible_according_to_vm,
 				bool *has_lpdead_items)
 {
@@ -1789,7 +1815,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, visibility_cutoff_xid,
+						  vacrel->skip.vmbuffer, visibility_cutoff_xid,
 						  flags);
 	}
 
@@ -1800,11 +1826,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+			 visibilitymap_get_status(vacrel->rel, blkno, &vacrel->skip.vmbuffer) != 0)
 	{
 		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1828,7 +1854,7 @@ lazy_scan_prune(LVRelState *vacrel,
 			 vacrel->relname, blkno);
 		PageClearAllVisible(page);
 		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1838,7 +1864,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * true, so we must check both all_visible and all_frozen.
 	 */
 	else if (all_visible_according_to_vm && all_visible &&
-			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
 	{
 		/*
 		 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1860,7 +1886,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		 */
 		Assert(!TransactionIdIsValid(visibility_cutoff_xid));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, InvalidTransactionId,
+						  vacrel->skip.vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
 						  VISIBILITYMAP_ALL_FROZEN);
 	}
-- 
2.37.2

#12Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#11)
7 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Jan 29, 2024 at 8:18 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Jan 26, 2024 at 8:28 AM vignesh C <vignesh21@gmail.com> wrote:

CFBot shows that the patch does not apply anymore as in [1]:
=== applying patch
./v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
patching file src/backend/access/heap/vacuumlazy.c
...
Hunk #10 FAILED at 1042.
Hunk #11 FAILED at 1121.
Hunk #12 FAILED at 1132.
Hunk #13 FAILED at 1161.
Hunk #14 FAILED at 1172.
Hunk #15 FAILED at 1194.
...
6 out of 21 hunks FAILED -- saving rejects to file
src/backend/access/heap/vacuumlazy.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4755.log

Fixed in attached rebased v4

In light of Thomas' update to the streaming read API [1], I have
rebased and updated this patch set.

The attached v5 has some simplifications when compared to v4 but takes
largely the same approach.

0001-0004 are refactoring
0005 is the streaming read code not yet in master
0006 is the vacuum streaming read user for vacuum's first pass
0007 is the vacuum streaming read user for vacuum's second pass

- Melanie

[1]: /messages/by-id/CA+hUKGJtLyxcAEvLhVUhgD4fMQkOu3PDaj8Qb9SR_UsmzgsBpQ@mail.gmail.com

Attachments:

v5-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch (text/x-patch)
From a153a50da9bcdeff408a1fe183ee2226570aadc9 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v5 4/7] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index e5988262611..ea270941379 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1190,8 +1190,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->skip.next_unskippable_block = next_unskippable_block;
-- 
2.37.2

v5-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch (text/x-patch)
From 958e254aa12779c3a9f457a5751ba85931e23201 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v5 1/7] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can easily be derived from next_unskippable_block's
progress relative to the passed-in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index fa56480808b..e21c1124f5c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1109,8 +1109,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1167,7 +1166,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1180,7 +1178,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

v5-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch (text/x-patch)
From e9a3b596fab396d3e5b4ac5a66bc80762667c885 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v5 3/7] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and the skip state in LVRelState it uses into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_get_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_get_next_block() -- instead of only when
we have reached the next unskippable block -- we no longer need the
skipping_current_range variable, and lazy_scan_heap() no longer needs to
manage the skipped range itself by checking whether it has reached the
end of the range before calling heap_vac_scan_get_next_block(). And
heap_vac_scan_get_next_block() can derive the visibility status of a
block from whether or not we are in a skippable range -- that is,
whether or not next_block is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 243 ++++++++++++++-------------
 1 file changed, 126 insertions(+), 117 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 077164896fb..e5988262611 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -212,8 +212,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
+	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -238,7 +238,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+										 BlockNumber *blkno,
+										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock);
@@ -820,8 +822,11 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -836,40 +841,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+
+	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
+										&blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == vacrel->skip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, blkno + 1);
-
-			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->skip.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1089,20 +1071,14 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in skip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
- * skip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needed a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * heap_vac_scan_get_next_block() sets blkno to the next block that actually needs
+ * to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1112,14 +1088,25 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in skip->next_unskippable_block and next_unskippable_allvis.
- * skip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacrel->skip.next_unskippable_block and next_unskippable_allvis.
+ *
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
+ * returns false if there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1129,91 +1116,113 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
+static bool
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->skip.next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   &vacrel->skip.vmbuffer);
+		*blkno = InvalidBlockNumber;
+		return false;
+	}
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->skip.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
+
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &vacrel->skip.vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->skip.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->skip.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+		vacrel->skip.next_unskippable_block = next_unskippable_block;
 
-	vacrel->skip.next_unskippable_block = next_unskippable_block;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->skip.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->skip.skipping_current_range = false;
+	if (next_block == vacrel->skip.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
 	else
-	{
-		vacrel->skip.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	*blkno = next_block;
+	return true;
 }
 
 /*
-- 
2.37.2
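As an aside for readers following the skipping logic in the hunk above: the decision reduces to (1) scanning the visibility map forward for the next block that can't be skipped (with the last block of the relation always treated as unsafe to skip, so lazy_truncate_heap() isn't defeated), and (2) only actually skipping the range if it is at least SKIP_PAGES_THRESHOLD pages long. Here is a minimal standalone sketch of just that shape; the names (SkipState, find_next_unskippable, range_is_skippable) and the bool-array "visibility map" are hypothetical illustrations, not the actual PostgreSQL code, though SKIP_PAGES_THRESHOLD really is 32 in vacuumlazy.c:

```c
#include <assert.h>
#include <stdbool.h>

#define SKIP_PAGES_THRESHOLD 32	/* matches vacuumlazy.c */

typedef struct SkipState
{
	unsigned	next_unskippable_block;
	bool		next_unskippable_allvis;
} SkipState;

/*
 * Toy stand-in for the VM scan: allvis[i] is true when block i is marked
 * all-visible.  Assumes rel_pages >= 1.  The last block is always treated
 * as unsafe to skip, mirroring the patch's comment about truncation.
 */
static void
find_next_unskippable(const bool *allvis, unsigned rel_pages,
					  unsigned next_block, SkipState *state)
{
	unsigned	b = next_block;

	while (b < rel_pages - 1 && allvis[b])
		b++;

	state->next_unskippable_block = b;
	state->next_unskippable_allvis = allvis[b];
}

/* Only skip a range of at least SKIP_PAGES_THRESHOLD consecutive pages. */
static bool
range_is_skippable(unsigned next_block, unsigned next_unskippable_block)
{
	return next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD;
}
```

The threshold exists because the OS is already doing readahead during a sequential scan, so skipping the odd page buys nothing and can break sequential-access detection.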

Attachment: v5-0005-Streaming-Read-API.patch (text/x-patch)
From 1f10bf10a02502a9521d8ce8130d1964315da30c Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Mon, 26 Feb 2024 23:48:31 +1300
Subject: [PATCH v5 5/7] Streaming Read API

---
 contrib/pg_prewarm/pg_prewarm.c          |  40 +-
 src/backend/storage/Makefile             |   2 +-
 src/backend/storage/aio/Makefile         |  14 +
 src/backend/storage/aio/meson.build      |   5 +
 src/backend/storage/aio/streaming_read.c | 612 ++++++++++++++++++++++
 src/backend/storage/buffer/bufmgr.c      | 641 ++++++++++++++++-------
 src/backend/storage/buffer/localbuf.c    |  14 +-
 src/backend/storage/meson.build          |   1 +
 src/include/storage/bufmgr.h             |  45 ++
 src/include/storage/streaming_read.h     |  52 ++
 src/tools/pgindent/typedefs.list         |   3 +
 11 files changed, 1218 insertions(+), 211 deletions(-)
 create mode 100644 src/backend/storage/aio/Makefile
 create mode 100644 src/backend/storage/aio/meson.build
 create mode 100644 src/backend/storage/aio/streaming_read.c
 create mode 100644 src/include/storage/streaming_read.h

diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index 8541e4d6e46..1cc84bcb0c2 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -20,6 +20,7 @@
 #include "miscadmin.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
+#include "storage/streaming_read.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -38,6 +39,25 @@ typedef enum
 
 static PGIOAlignedBlock blockbuffer;
 
+struct pg_prewarm_streaming_read_private
+{
+	BlockNumber blocknum;
+	int64		last_block;
+};
+
+static BlockNumber
+pg_prewarm_streaming_read_next(PgStreamingRead *pgsr,
+							   void *pgsr_private,
+							   void *per_buffer_data)
+{
+	struct pg_prewarm_streaming_read_private *p = pgsr_private;
+
+	if (p->blocknum <= p->last_block)
+		return p->blocknum++;
+
+	return InvalidBlockNumber;
+}
+
 /*
  * pg_prewarm(regclass, mode text, fork text,
  *			  first_block int8, last_block int8)
@@ -183,18 +203,36 @@ pg_prewarm(PG_FUNCTION_ARGS)
 	}
 	else if (ptype == PREWARM_BUFFER)
 	{
+		struct pg_prewarm_streaming_read_private p;
+		PgStreamingRead *pgsr;
+
 		/*
 		 * In buffer mode, we actually pull the data into shared_buffers.
 		 */
+
+		/* Set up the private state for our streaming buffer read callback. */
+		p.blocknum = first_block;
+		p.last_block = last_block;
+
+		pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_FULL,
+											  &p,
+											  0,
+											  NULL,
+											  BMR_REL(rel),
+											  forkNumber,
+											  pg_prewarm_streaming_read_next);
+
 		for (block = first_block; block <= last_block; ++block)
 		{
 			Buffer		buf;
 
 			CHECK_FOR_INTERRUPTS();
-			buf = ReadBufferExtended(rel, forkNumber, block, RBM_NORMAL, NULL);
+			buf = pg_streaming_read_buffer_get_next(pgsr, NULL);
 			ReleaseBuffer(buf);
 			++blocks_done;
 		}
+		Assert(pg_streaming_read_buffer_get_next(pgsr, NULL) == InvalidBuffer);
+		pg_streaming_read_free(pgsr);
 	}
 
 	/* Close relation, release lock. */
diff --git a/src/backend/storage/Makefile b/src/backend/storage/Makefile
index 8376cdfca20..eec03f6f2b4 100644
--- a/src/backend/storage/Makefile
+++ b/src/backend/storage/Makefile
@@ -8,6 +8,6 @@ subdir = src/backend/storage
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS     = buffer file freespace ipc large_object lmgr page smgr sync
+SUBDIRS     = aio buffer file freespace ipc large_object lmgr page smgr sync
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/Makefile b/src/backend/storage/aio/Makefile
new file mode 100644
index 00000000000..bcab44c802f
--- /dev/null
+++ b/src/backend/storage/aio/Makefile
@@ -0,0 +1,14 @@
+#
+# Makefile for storage/aio
+#
+# src/backend/storage/aio/Makefile
+#
+
+subdir = src/backend/storage/aio
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = \
+	streaming_read.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/meson.build b/src/backend/storage/aio/meson.build
new file mode 100644
index 00000000000..39aef2a84a2
--- /dev/null
+++ b/src/backend/storage/aio/meson.build
@@ -0,0 +1,5 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+backend_sources += files(
+  'streaming_read.c',
+)
diff --git a/src/backend/storage/aio/streaming_read.c b/src/backend/storage/aio/streaming_read.c
new file mode 100644
index 00000000000..71f2c4a70b6
--- /dev/null
+++ b/src/backend/storage/aio/streaming_read.c
@@ -0,0 +1,612 @@
+#include "postgres.h"
+
+#include "storage/streaming_read.h"
+#include "utils/rel.h"
+
+/*
+ * Element type for PgStreamingRead's circular array of block ranges.
+ */
+typedef struct PgStreamingReadRange
+{
+	bool		need_wait;
+	bool		advice_issued;
+	BlockNumber blocknum;
+	int			nblocks;
+	int			per_buffer_data_index;
+	Buffer		buffers[MAX_BUFFERS_PER_TRANSFER];
+	ReadBuffersOperation operation;
+} PgStreamingReadRange;
+
+/*
+ * Streaming read object.
+ */
+struct PgStreamingRead
+{
+	int			max_ios;
+	int			ios_in_progress;
+	int			max_pinned_buffers;
+	int			pinned_buffers;
+	int			pinned_buffers_trigger;
+	int			next_tail_buffer;
+	int			ramp_up_pin_limit;
+	int			ramp_up_pin_stall;
+	bool		finished;
+	bool		advice_enabled;
+	void	   *pgsr_private;
+	PgStreamingReadBufferCB callback;
+
+	BufferAccessStrategy strategy;
+	BufferManagerRelation bmr;
+	ForkNumber	forknum;
+
+	/* Sometimes we need to buffer one block for flow control. */
+	BlockNumber unget_blocknum;
+	void	   *unget_per_buffer_data;
+
+	/* Next expected block, for detecting sequential access. */
+	BlockNumber seq_blocknum;
+
+	/* Space for optional per-buffer private data. */
+	size_t		per_buffer_data_size;
+	void	   *per_buffer_data;
+
+	/* Circular buffer of ranges. */
+	int			size;
+	int			head;
+	int			tail;
+	PgStreamingReadRange ranges[FLEXIBLE_ARRAY_MEMBER];
+};
+
+static PgStreamingRead *
+pg_streaming_read_buffer_alloc_internal(int flags,
+										void *pgsr_private,
+										size_t per_buffer_data_size,
+										BufferAccessStrategy strategy)
+{
+	PgStreamingRead *pgsr;
+	int			size;
+	int			max_ios;
+	uint32		max_pinned_buffers;
+
+
+	/*
+	 * Decide how many assumed I/Os we will allow to run concurrently.  That
+	 * is, advice to the kernel to tell it that we will soon read.  This
+	 * number also affects how far we look ahead for opportunities to start
+	 * more I/Os.
+	 */
+	if (flags & PGSR_FLAG_MAINTENANCE)
+		max_ios = maintenance_io_concurrency;
+	else
+		max_ios = effective_io_concurrency;
+
+	/*
+	 * The desired level of I/O concurrency controls how far ahead we are
+	 * willing to look.  We also clamp it to at least
+	 * MAX_BUFFERS_PER_TRANSFER so that we can have a chance to build up a
+	 * full-sized read, even when max_ios is zero.
+	 */
+	max_pinned_buffers = Max(max_ios * 4, MAX_BUFFERS_PER_TRANSFER);
+
+	/*
+	 * The *_io_concurrency GUCs might be set to 0, but we want to allow at
+	 * least one, to keep our gating logic simple.
+	 */
+	max_ios = Max(max_ios, 1);
+
+	/*
+	 * Don't allow this backend to pin too many buffers.  For now we'll apply
+	 * the limit for the shared buffer pool and the local buffer pool, without
+	 * worrying which it is.
+	 */
+	LimitAdditionalPins(&max_pinned_buffers);
+	LimitAdditionalLocalPins(&max_pinned_buffers);
+	Assert(max_pinned_buffers > 0);
+
+	/*
+	 * pgsr->ranges is a circular buffer.  When it is empty, head == tail.
+	 * When it is full, there is an empty element between head and tail.  Head
+	 * can also be empty (nblocks == 0), therefore we need two extra elements
+	 * for non-occupied ranges, on top of max_pinned_buffers to allow for the
+	 * maximum possible number of occupied ranges of the smallest possible
+	 * size of one.
+	 */
+	size = max_pinned_buffers + 2;
+
+	pgsr = (PgStreamingRead *)
+		palloc0(offsetof(PgStreamingRead, ranges) +
+				sizeof(pgsr->ranges[0]) * size);
+
+	pgsr->max_ios = max_ios;
+	pgsr->per_buffer_data_size = per_buffer_data_size;
+	pgsr->max_pinned_buffers = max_pinned_buffers;
+	pgsr->pgsr_private = pgsr_private;
+	pgsr->strategy = strategy;
+	pgsr->size = size;
+
+	pgsr->unget_blocknum = InvalidBlockNumber;
+
+#ifdef USE_PREFETCH
+
+	/*
+	 * This system supports prefetching advice.  As long as direct I/O isn't
+	 * enabled, and the caller hasn't promised sequential access, we can use
+	 * it.
+	 */
+	if ((io_direct_flags & IO_DIRECT_DATA) == 0 &&
+		(flags & PGSR_FLAG_SEQUENTIAL) == 0)
+		pgsr->advice_enabled = true;
+#endif
+
+	/*
+	 * We start off building small ranges, but double that quickly, for the
+	 * benefit of users that don't know how far ahead they'll read.  This can
+	 * be disabled by users that already know they'll read all the way.
+	 */
+	if (flags & PGSR_FLAG_FULL)
+		pgsr->ramp_up_pin_limit = INT_MAX;
+	else
+		pgsr->ramp_up_pin_limit = 1;
+
+	/*
+	 * We want to avoid creating ranges that are smaller than they could be
+	 * just because we hit max_pinned_buffers.  We only look ahead when the
+	 * number of pinned buffers falls below this trigger number, or put
+	 * another way, we stop looking ahead when we wouldn't be able to build a
+	 * "full sized" range.
+	 */
+	pgsr->pinned_buffers_trigger =
+		Max(1, (int) max_pinned_buffers - MAX_BUFFERS_PER_TRANSFER);
+
+	/* Space for the callback to store extra data along with each block. */
+	if (per_buffer_data_size)
+		pgsr->per_buffer_data = palloc(per_buffer_data_size * max_pinned_buffers);
+
+	return pgsr;
+}
+
+/*
+ * Create a new streaming read object that can be used to perform the
+ * equivalent of a series of ReadBuffer() calls for one fork of one relation.
+ * Internally, it generates larger vectored reads where possible by looking
+ * ahead.
+ */
+PgStreamingRead *
+pg_streaming_read_buffer_alloc(int flags,
+							   void *pgsr_private,
+							   size_t per_buffer_data_size,
+							   BufferAccessStrategy strategy,
+							   BufferManagerRelation bmr,
+							   ForkNumber forknum,
+							   PgStreamingReadBufferCB next_block_cb)
+{
+	PgStreamingRead *result;
+
+	result = pg_streaming_read_buffer_alloc_internal(flags,
+													 pgsr_private,
+													 per_buffer_data_size,
+													 strategy);
+	result->callback = next_block_cb;
+	result->bmr = bmr;
+	result->forknum = forknum;
+
+	return result;
+}
+
+/*
+ * Find the per-buffer data index for the Nth block of a range.
+ */
+static int
+get_per_buffer_data_index(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	int			result;
+
+	/*
+	 * Find slot in the circular buffer of per-buffer data, without using the
+	 * expensive % operator.
+	 */
+	result = range->per_buffer_data_index + n;
+	if (result >= pgsr->max_pinned_buffers)
+		result -= pgsr->max_pinned_buffers;
+	Assert(result == (range->per_buffer_data_index + n) % pgsr->max_pinned_buffers);
+
+	return result;
+}
+
+/*
+ * Return a pointer to the per-buffer data by index.
+ */
+static void *
+get_per_buffer_data_by_index(PgStreamingRead *pgsr, int per_buffer_data_index)
+{
+	return (char *) pgsr->per_buffer_data +
+		pgsr->per_buffer_data_size * per_buffer_data_index;
+}
+
+/*
+ * Return a pointer to the per-buffer data for the Nth block of a range.
+ */
+static void *
+get_per_buffer_data(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	return get_per_buffer_data_by_index(pgsr,
+										get_per_buffer_data_index(pgsr,
+																  range,
+																  n));
+}
+
+/*
+ * Start reading the head range, and create a new head range.  The new head
+ * range is returned.  It may not be empty, if StartReadBuffers() couldn't
+ * start the entire range; in that case the returned range contains the
+ * remaining portion of the range.
+ */
+static PgStreamingReadRange *
+pg_streaming_read_start_head_range(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *head_range;
+	PgStreamingReadRange *new_head_range;
+	int			nblocks_pinned;
+	int			flags;
+
+	/* Caller should make sure we never exceed max_ios. */
+	Assert(pgsr->ios_in_progress < pgsr->max_ios);
+
+	/* Should only call if the head range has some blocks to read. */
+	head_range = &pgsr->ranges[pgsr->head];
+	Assert(head_range->nblocks > 0);
+
+	/*
+	 * If advice hasn't been suppressed, this system supports it, and this
+	 * isn't a strictly sequential pattern, then we'll issue advice.
+	 */
+	if (pgsr->advice_enabled && head_range->blocknum != pgsr->seq_blocknum)
+		flags = READ_BUFFERS_ISSUE_ADVICE;
+	else
+		flags = 0;
+
+
+	/* Start reading as many blocks as we can from the head range. */
+	nblocks_pinned = head_range->nblocks;
+	head_range->need_wait =
+		StartReadBuffers(pgsr->bmr,
+						 head_range->buffers,
+						 pgsr->forknum,
+						 head_range->blocknum,
+						 &nblocks_pinned,
+						 pgsr->strategy,
+						 flags,
+						 &head_range->operation);
+
+	/* Did that start an I/O? */
+	if (head_range->need_wait && (flags & READ_BUFFERS_ISSUE_ADVICE))
+	{
+		head_range->advice_issued = true;
+		pgsr->ios_in_progress++;
+		Assert(pgsr->ios_in_progress <= pgsr->max_ios);
+	}
+
+	/*
+	 * StartReadBuffers() might have pinned fewer blocks than we asked it to,
+	 * but always at least one.
+	 */
+	Assert(nblocks_pinned <= head_range->nblocks);
+	Assert(nblocks_pinned >= 1);
+	pgsr->pinned_buffers += nblocks_pinned;
+
+	/*
+	 * Remember where the next block would be after that, so we can detect
+	 * sequential access next time.
+	 */
+	pgsr->seq_blocknum = head_range->blocknum + nblocks_pinned;
+
+	/*
+	 * Create a new head range.  There must be space, because we have enough
+	 * elements for every range to hold just one block, up to the pin limit.
+	 */
+	Assert(pgsr->size > pgsr->max_pinned_buffers);
+	Assert((pgsr->head + 1) % pgsr->size != pgsr->tail);
+	if (++pgsr->head == pgsr->size)
+		pgsr->head = 0;
+	new_head_range = &pgsr->ranges[pgsr->head];
+	new_head_range->nblocks = 0;
+	new_head_range->advice_issued = false;
+
+	/*
+	 * If we didn't manage to start the whole read above, we split the range,
+	 * moving the remainder into the new head range.
+	 */
+	if (nblocks_pinned < head_range->nblocks)
+	{
+		int			nblocks_remaining = head_range->nblocks - nblocks_pinned;
+
+		head_range->nblocks = nblocks_pinned;
+
+		new_head_range->blocknum = head_range->blocknum + nblocks_pinned;
+		new_head_range->nblocks = nblocks_remaining;
+	}
+
+	/* The new range has per-buffer data starting after the previous range. */
+	new_head_range->per_buffer_data_index =
+		get_per_buffer_data_index(pgsr, head_range, nblocks_pinned);
+
+	return new_head_range;
+}
+
+/*
+ * Ask the callback which block it would like us to read next, with a small
+ * buffer in front to allow pg_streaming_unget_block() to work.
+ */
+static BlockNumber
+pg_streaming_get_block(PgStreamingRead *pgsr, void *per_buffer_data)
+{
+	BlockNumber result;
+
+	if (unlikely(pgsr->unget_blocknum != InvalidBlockNumber))
+	{
+		/*
+		 * If we had to unget a block, now it is time to return that one
+		 * again.
+		 */
+		result = pgsr->unget_blocknum;
+		pgsr->unget_blocknum = InvalidBlockNumber;
+
+		/*
+		 * The same per_buffer_data element must have been used, and still
+		 * contains whatever data the callback wrote into it.  So we just
+		 * sanity-check that we were called with the value that
+		 * pg_streaming_unget_block() pushed back.
+		 */
+		Assert(per_buffer_data == pgsr->unget_per_buffer_data);
+	}
+	else
+	{
+		/* Use the installed callback directly. */
+		result = pgsr->callback(pgsr, pgsr->pgsr_private, per_buffer_data);
+	}
+
+	return result;
+}
+
+/*
+ * In order to deal with short reads in StartReadBuffers(), we sometimes need
+ * to defer handling of a block until later.  This *must* be called with the
+ * last value returned by pg_streaming_get_block().
+ */
+static void
+pg_streaming_unget_block(PgStreamingRead *pgsr, BlockNumber blocknum, void *per_buffer_data)
+{
+	Assert(pgsr->unget_blocknum == InvalidBlockNumber);
+	pgsr->unget_blocknum = blocknum;
+	pgsr->unget_per_buffer_data = per_buffer_data;
+}
+
+static void
+pg_streaming_read_look_ahead(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *range;
+
+	/*
+	 * If we're still ramping up, we may have to stall to wait for buffers to
+	 * be consumed first before we do any more prefetching.
+	 */
+	if (pgsr->ramp_up_pin_stall > 0)
+	{
+		Assert(pgsr->pinned_buffers > 0);
+		return;
+	}
+
+	/*
+	 * If we're finished or can't start more I/O, then don't look ahead.
+	 */
+	if (pgsr->finished || pgsr->ios_in_progress == pgsr->max_ios)
+		return;
+
+	/*
+	 * We'll also wait until the number of pinned buffers falls below our
+	 * trigger level, so that we have the chance to create a full range.
+	 */
+	if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+		return;
+
+	do
+	{
+		BlockNumber blocknum;
+		void	   *per_buffer_data;
+
+		/* Do we have a full-sized range? */
+		range = &pgsr->ranges[pgsr->head];
+		if (range->nblocks == lengthof(range->buffers))
+		{
+			/* Start as much of it as we can. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/* If we're now at the I/O limit, stop here. */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+				return;
+
+			/*
+			 * If we couldn't form a full range, then stop here to avoid
+			 * creating small I/O.
+			 */
+			if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+				return;
+
+			/*
+			 * The range might have been only partially started, but at least
+			 * one block is always processed, so that'll do for now.
+			 */
+			Assert(range->nblocks < lengthof(range->buffers));
+		}
+
+		/* Find per-buffer data slot for the next block. */
+		per_buffer_data = get_per_buffer_data(pgsr, range, range->nblocks);
+
+		/* Find out which block the callback wants to read next. */
+		blocknum = pg_streaming_get_block(pgsr, per_buffer_data);
+		if (blocknum == InvalidBlockNumber)
+		{
+			/* End of stream. */
+			pgsr->finished = true;
+			break;
+		}
+
+		/*
+		 * Is there a head range that we cannot extend, because the requested
+		 * block is not consecutive?
+		 */
+		if (range->nblocks > 0 &&
+			range->blocknum + range->nblocks != blocknum)
+		{
+			/* Yes.  Start it, so we can begin building a new one. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * It's possible that it was only partially started, and we have a
+			 * new range with the remainder.  Keep starting I/Os until we get
+			 * it all out of the way, or we hit the I/O limit.
+			 */
+			while (range->nblocks > 0 && pgsr->ios_in_progress < pgsr->max_ios)
+				range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * We have to 'unget' the block returned by the callback if we
+			 * don't have enough I/O capacity left to start something.
+			 */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+			{
+				pg_streaming_unget_block(pgsr, blocknum, per_buffer_data);
+				return;
+			}
+		}
+
+		/* If we have a new, empty range, initialize the start block. */
+		if (range->nblocks == 0)
+		{
+			range->blocknum = blocknum;
+		}
+
+		/* This block extends the range by one. */
+		Assert(range->blocknum + range->nblocks == blocknum);
+		range->nblocks++;
+
+	} while (pgsr->pinned_buffers + range->nblocks < pgsr->max_pinned_buffers &&
+			 pgsr->pinned_buffers + range->nblocks < pgsr->ramp_up_pin_limit);
+
+	/* If we've hit the ramp-up limit, insert a stall. */
+	if (pgsr->pinned_buffers + range->nblocks >= pgsr->ramp_up_pin_limit)
+	{
+		/* Can't get here if an earlier stall hasn't finished. */
+		Assert(pgsr->ramp_up_pin_stall == 0);
+		/* Don't do any more prefetching until these buffers are consumed. */
+		pgsr->ramp_up_pin_stall = pgsr->ramp_up_pin_limit;
+		/* Double it.  It will soon be out of the way. */
+		pgsr->ramp_up_pin_limit *= 2;
+	}
+
+	/* Start as much as we can. */
+	while (range->nblocks > 0)
+	{
+		range = pg_streaming_read_start_head_range(pgsr);
+		if (pgsr->ios_in_progress == pgsr->max_ios)
+			break;
+	}
+}
+
+Buffer
+pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)
+{
+	pg_streaming_read_look_ahead(pgsr);
+
+	/* See if we have one buffer to return. */
+	while (pgsr->tail != pgsr->head)
+	{
+		PgStreamingReadRange *tail_range;
+
+		tail_range = &pgsr->ranges[pgsr->tail];
+
+		/*
+		 * Do we need to perform an I/O before returning the buffers from this
+		 * range?
+		 */
+		if (tail_range->need_wait)
+		{
+			WaitReadBuffers(&tail_range->operation);
+			tail_range->need_wait = false;
+
+			/*
+			 * We don't really know if the kernel generated a physical I/O
+			 * when we issued advice, let alone when it finished, but it has
+			 * certainly finished now because we've performed the read.
+			 */
+			if (tail_range->advice_issued)
+			{
+				Assert(pgsr->ios_in_progress > 0);
+				pgsr->ios_in_progress--;
+			}
+		}
+
+		/* Are there more buffers available in this range? */
+		if (pgsr->next_tail_buffer < tail_range->nblocks)
+		{
+			int			buffer_index;
+			Buffer		buffer;
+
+			buffer_index = pgsr->next_tail_buffer++;
+			buffer = tail_range->buffers[buffer_index];
+
+			Assert(BufferIsValid(buffer));
+
+			/* We are giving away ownership of this pinned buffer. */
+			Assert(pgsr->pinned_buffers > 0);
+			pgsr->pinned_buffers--;
+
+			if (pgsr->ramp_up_pin_stall > 0)
+				pgsr->ramp_up_pin_stall--;
+
+			if (per_buffer_data)
+				*per_buffer_data = get_per_buffer_data(pgsr, tail_range, buffer_index);
+
+			return buffer;
+		}
+
+		/* Advance tail to next range, if there is one. */
+		if (++pgsr->tail == pgsr->size)
+			pgsr->tail = 0;
+		pgsr->next_tail_buffer = 0;
+
+		/*
+		 * If tail crashed into head, and head is not empty, then it is time
+		 * to start that range.
+		 */
+		if (pgsr->tail == pgsr->head &&
+			pgsr->ranges[pgsr->head].nblocks > 0)
+			pg_streaming_read_start_head_range(pgsr);
+	}
+
+	Assert(pgsr->pinned_buffers == 0);
+
+	return InvalidBuffer;
+}
+
+void
+pg_streaming_read_free(PgStreamingRead *pgsr)
+{
+	Buffer		buffer;
+
+	/* Stop looking ahead. */
+	pgsr->finished = true;
+
+	/* Unpin anything that wasn't consumed. */
+	while ((buffer = pg_streaming_read_buffer_get_next(pgsr, NULL)) != InvalidBuffer)
+		ReleaseBuffer(buffer);
+
+	Assert(pgsr->pinned_buffers == 0);
+	Assert(pgsr->ios_in_progress == 0);
+
+	/* Release memory. */
+	if (pgsr->per_buffer_data)
+		pfree(pgsr->per_buffer_data);
+
+	pfree(pgsr);
+}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index bdf89bbc4dc..3b1b0ad99df 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -19,6 +19,11 @@
  *		and pin it so that no one can destroy it while this process
  *		is using it.
  *
+ * StartReadBuffers() -- as above, but for multiple contiguous blocks in
+ *		two steps.
+ *
+ * WaitReadBuffers() -- second step of StartReadBuffers().
+ *
  * ReleaseBuffer() -- unpin a buffer
  *
  * MarkBufferDirty() -- mark a pinned buffer's contents as "dirty".
@@ -472,10 +477,9 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
 )
 
 
-static Buffer ReadBuffer_common(SMgrRelation smgr, char relpersistence,
+static Buffer ReadBuffer_common(BufferManagerRelation bmr,
 								ForkNumber forkNum, BlockNumber blockNum,
-								ReadBufferMode mode, BufferAccessStrategy strategy,
-								bool *hit);
+								ReadBufferMode mode, BufferAccessStrategy strategy);
 static BlockNumber ExtendBufferedRelCommon(BufferManagerRelation bmr,
 										   ForkNumber fork,
 										   BufferAccessStrategy strategy,
@@ -501,7 +505,7 @@ static uint32 WaitBufHdrUnlocked(BufferDesc *buf);
 static int	SyncOneBuffer(int buf_id, bool skip_recently_used,
 						  WritebackContext *wb_context);
 static void WaitIO(BufferDesc *buf);
-static bool StartBufferIO(BufferDesc *buf, bool forInput);
+static bool StartBufferIO(BufferDesc *buf, bool forInput, bool nowait);
 static void TerminateBufferIO(BufferDesc *buf, bool clear_dirty,
 							  uint32 set_flag_bits, bool forget_owner);
 static void AbortBufferIO(Buffer buffer);
@@ -782,7 +786,6 @@ Buffer
 ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				   ReadBufferMode mode, BufferAccessStrategy strategy)
 {
-	bool		hit;
 	Buffer		buf;
 
 	/*
@@ -795,15 +798,9 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary tables of other sessions")));
 
-	/*
-	 * Read the buffer, and update pgstat counters to reflect a cache hit or
-	 * miss.
-	 */
-	pgstat_count_buffer_read(reln);
-	buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
-							forkNum, blockNum, mode, strategy, &hit);
-	if (hit)
-		pgstat_count_buffer_hit(reln);
+	buf = ReadBuffer_common(BMR_REL(reln),
+							forkNum, blockNum, mode, strategy);
+
 	return buf;
 }
 
@@ -823,13 +820,12 @@ ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
 						  BlockNumber blockNum, ReadBufferMode mode,
 						  BufferAccessStrategy strategy, bool permanent)
 {
-	bool		hit;
-
 	SMgrRelation smgr = smgropen(rlocator, InvalidBackendId);
 
-	return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
-							 RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
-							 mode, strategy, &hit);
+	return ReadBuffer_common(BMR_SMGR(smgr, permanent ? RELPERSISTENCE_PERMANENT :
+									  RELPERSISTENCE_UNLOGGED),
+							 forkNum, blockNum,
+							 mode, strategy);
 }
 
 /*
@@ -995,35 +991,68 @@ ExtendBufferedRelTo(BufferManagerRelation bmr,
 	 */
 	if (buffer == InvalidBuffer)
 	{
-		bool		hit;
-
 		Assert(extended_by == 0);
-		buffer = ReadBuffer_common(bmr.smgr, bmr.relpersistence,
-								   fork, extend_to - 1, mode, strategy,
-								   &hit);
+		buffer = ReadBuffer_common(bmr, fork, extend_to - 1, mode, strategy);
 	}
 
 	return buffer;
 }
 
+/*
+ * Zero a buffer and lock it, as part of the implementation of
+ * RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK.  The buffer must be already
+ * pinned.  It does not have to be valid, but it is valid and locked on
+ * return.
+ */
+static void
+ZeroBuffer(Buffer buffer, ReadBufferMode mode)
+{
+	BufferDesc *bufHdr;
+	uint32		buf_state;
+
+	Assert(mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK);
+
+	if (BufferIsLocal(buffer))
+		bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+	else
+	{
+		bufHdr = GetBufferDescriptor(buffer - 1);
+		if (mode == RBM_ZERO_AND_LOCK)
+			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+		else
+			LockBufferForCleanup(buffer);
+	}
+
+	memset(BufferGetPage(buffer), 0, BLCKSZ);
+
+	if (BufferIsLocal(buffer))
+	{
+		buf_state = pg_atomic_read_u32(&bufHdr->state);
+		buf_state |= BM_VALID;
+		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+	}
+	else
+	{
+		buf_state = LockBufHdr(bufHdr);
+		buf_state |= BM_VALID;
+		UnlockBufHdr(bufHdr, buf_state);
+	}
+}
+
 /*
  * ReadBuffer_common -- common logic for all ReadBuffer variants
- *
- * *hit is set to true if the request was satisfied from shared buffer cache.
  */
 static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(BufferManagerRelation bmr, ForkNumber forkNum,
 				  BlockNumber blockNum, ReadBufferMode mode,
-				  BufferAccessStrategy strategy, bool *hit)
+				  BufferAccessStrategy strategy)
 {
-	BufferDesc *bufHdr;
-	Block		bufBlock;
-	bool		found;
-	IOContext	io_context;
-	IOObject	io_object;
-	bool		isLocalBuf = SmgrIsTemp(smgr);
-
-	*hit = false;
+	ReadBuffersOperation operation;
+	Buffer		buffer;
+	int			nblocks;
+	int			flags;
 
 	/*
 	 * Backward compatibility path, most code should use ExtendBufferedRel()
@@ -1042,181 +1071,404 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
 			flags |= EB_LOCK_FIRST;
 
-		return ExtendBufferedRel(BMR_SMGR(smgr, relpersistence),
-								 forkNum, strategy, flags);
+		return ExtendBufferedRel(bmr, forkNum, strategy, flags);
 	}
 
-	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
-									   smgr->smgr_rlocator.locator.spcOid,
-									   smgr->smgr_rlocator.locator.dbOid,
-									   smgr->smgr_rlocator.locator.relNumber,
-									   smgr->smgr_rlocator.backend);
+	nblocks = 1;
+	if (mode == RBM_ZERO_ON_ERROR)
+		flags = READ_BUFFERS_ZERO_ON_ERROR;
+	else
+		flags = 0;
+	if (StartReadBuffers(bmr,
+						 &buffer,
+						 forkNum,
+						 blockNum,
+						 &nblocks,
+						 strategy,
+						 flags,
+						 &operation))
+		WaitReadBuffers(&operation);
+	Assert(nblocks == 1);		/* single block can't be short */
+
+	if (mode == RBM_ZERO_AND_CLEANUP_LOCK || mode == RBM_ZERO_AND_LOCK)
+		ZeroBuffer(buffer, mode);
+
+	return buffer;
+}
+
+static Buffer
+PrepareReadBuffer(BufferManagerRelation bmr,
+				  ForkNumber forkNum,
+				  BlockNumber blockNum,
+				  BufferAccessStrategy strategy,
+				  bool *foundPtr)
+{
+	BufferDesc *bufHdr;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	Assert(blockNum != P_NEW);
 
+	Assert(bmr.smgr);
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/*
-		 * We do not use a BufferAccessStrategy for I/O of temporary tables.
-		 * However, in some cases, the "strategy" may not be NULL, so we can't
-		 * rely on IOContextForStrategy() to set the right IOContext for us.
-		 * This may happen in cases like CREATE TEMPORARY TABLE AS...
-		 */
 		io_context = IOCONTEXT_NORMAL;
 		io_object = IOOBJECT_TEMP_RELATION;
-		bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
-		if (found)
-			pgBufferUsage.local_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.local_blks_read++;
 	}
 	else
 	{
-		/*
-		 * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
-		 * not currently in memory.
-		 */
 		io_context = IOContextForStrategy(strategy);
 		io_object = IOOBJECT_RELATION;
-		bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
-							 strategy, &found, io_context);
-		if (found)
-			pgBufferUsage.shared_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.shared_blks_read++;
 	}
 
-	/* At this point we do NOT hold any locks. */
+	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
+									   bmr.smgr->smgr_rlocator.locator.spcOid,
+									   bmr.smgr->smgr_rlocator.locator.dbOid,
+									   bmr.smgr->smgr_rlocator.locator.relNumber,
+									   bmr.smgr->smgr_rlocator.backend);
 
-	/* if it was already in the buffer pool, we're done */
-	if (found)
+	ResourceOwnerEnlarge(CurrentResourceOwner);
+	if (isLocalBuf)
+	{
+		bufHdr = LocalBufferAlloc(bmr.smgr, forkNum, blockNum, foundPtr);
+		if (*foundPtr)
+			pgBufferUsage.local_blks_hit++;
+	}
+	else
+	{
+		bufHdr = BufferAlloc(bmr.smgr, bmr.relpersistence, forkNum, blockNum,
+							 strategy, foundPtr, io_context);
+		if (*foundPtr)
+			pgBufferUsage.shared_blks_hit++;
+	}
+	if (bmr.rel)
+	{
+		/*
+		 * While pgBufferUsage's "read" counter isn't bumped unless we reach
+		 * WaitReadBuffers() (so, not for hits, and not for buffers that are
+		 * zeroed instead), the per-relation stats always count them.
+		 */
+		pgstat_count_buffer_read(bmr.rel);
+		if (*foundPtr)
+			pgstat_count_buffer_hit(bmr.rel);
+	}
+	if (*foundPtr)
 	{
-		/* Just need to update stats before we exit */
-		*hit = true;
 		VacuumPageHit++;
 		pgstat_count_io_op(io_object, io_context, IOOP_HIT);
-
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageHit;
 
 		TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-										  smgr->smgr_rlocator.locator.spcOid,
-										  smgr->smgr_rlocator.locator.dbOid,
-										  smgr->smgr_rlocator.locator.relNumber,
-										  smgr->smgr_rlocator.backend,
-										  found);
+										  bmr.smgr->smgr_rlocator.locator.spcOid,
+										  bmr.smgr->smgr_rlocator.locator.dbOid,
+										  bmr.smgr->smgr_rlocator.locator.relNumber,
+										  bmr.smgr->smgr_rlocator.backend,
+										  true);
+	}
 
-		/*
-		 * In RBM_ZERO_AND_LOCK mode the caller expects the page to be locked
-		 * on return.
-		 */
-		if (!isLocalBuf)
-		{
-			if (mode == RBM_ZERO_AND_LOCK)
-				LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
-							  LW_EXCLUSIVE);
-			else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
-				LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
-		}
+	return BufferDescriptorGetBuffer(bufHdr);
+}
 
-		return BufferDescriptorGetBuffer(bufHdr);
+/*
+ * Begin reading a range of blocks beginning at blockNum and extending for
+ * *nblocks.  On return, up to *nblocks pinned buffers holding those blocks
+ * are written into the buffers array, and *nblocks is updated to contain the
+ * actual number, which may be fewer than requested.
+ *
+ * If false is returned, no I/O was started and WaitReadBuffers() need not
+ * be called.  If true is returned, one I/O has been started, and
+ * WaitReadBuffers() must be called with the same operation object before the
+ * buffers are accessed.  Along with the operation object, the caller-supplied
+ * array of buffers must remain valid until WaitReadBuffers() is called.
+ *
+ * Currently the I/O is only started with optional operating system advice,
+ * and the real I/O happens in WaitReadBuffers().  In future work, true I/O
+ * could be initiated here.
+ */
+bool
+StartReadBuffers(BufferManagerRelation bmr,
+				 Buffer *buffers,
+				 ForkNumber forkNum,
+				 BlockNumber blockNum,
+				 int *nblocks,
+				 BufferAccessStrategy strategy,
+				 int flags,
+				 ReadBuffersOperation *operation)
+{
+	int			actual_nblocks = *nblocks;
+
+	if (bmr.rel)
+	{
+		bmr.smgr = RelationGetSmgr(bmr.rel);
+		bmr.relpersistence = bmr.rel->rd_rel->relpersistence;
 	}
 
-	/*
-	 * if we have gotten to this point, we have allocated a buffer for the
-	 * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
-	 * if it's a shared buffer.
-	 */
-	Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));	/* spinlock not needed */
+	operation->bmr = bmr;
+	operation->forknum = forkNum;
+	operation->blocknum = blockNum;
+	operation->buffers = buffers;
+	operation->nblocks = actual_nblocks;
+	operation->strategy = strategy;
+	operation->flags = flags;
 
-	bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
+	operation->io_buffers_len = 0;
 
-	/*
-	 * Read in the page, unless the caller intends to overwrite it and just
-	 * wants us to allocate a buffer.
-	 */
-	if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
-		MemSet((char *) bufBlock, 0, BLCKSZ);
-	else
+	for (int i = 0; i < actual_nblocks; ++i)
 	{
-		instr_time	io_start = pgstat_prepare_io_time(track_io_timing);
+		bool		found;
 
-		smgrread(smgr, forkNum, blockNum, bufBlock);
+		buffers[i] = PrepareReadBuffer(bmr,
+									   forkNum,
+									   blockNum + i,
+									   strategy,
+									   &found);
 
-		pgstat_count_io_op_time(io_object, io_context,
-								IOOP_READ, io_start, 1);
+		if (found)
+		{
+			/*
+			 * Terminate the read as soon as we get a hit.  It could be a
+			 * single buffer hit, or it could be a hit that follows a readable
+			 * range.  We don't want to create more than one readable range,
+			 * so we stop here.
+			 */
+			actual_nblocks = operation->nblocks = *nblocks = i + 1;
+		}
+		else
+		{
+			/* Extend the readable range to cover this block. */
+			operation->io_buffers_len++;
+		}
+	}
 
-		/* check for garbage data */
-		if (!PageIsVerifiedExtended((Page) bufBlock, blockNum,
-									PIV_LOG_WARNING | PIV_REPORT_STAT))
+	if (operation->io_buffers_len > 0)
+	{
+		if (flags & READ_BUFFERS_ISSUE_ADVICE)
 		{
-			if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
-			{
-				ereport(WARNING,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s; zeroing out page",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
-				MemSet((char *) bufBlock, 0, BLCKSZ);
-			}
-			else
-				ereport(ERROR,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
+			/*
+			 * In theory we should only do this if PrepareReadBuffer() had to
+			 * allocate new buffers above.  That way, if two calls to
+			 * StartReadBuffers() were made for the same blocks before
+			 * WaitReadBuffers(), only the first would issue the advice.
+			 * That'd be a better simulation of true asynchronous I/O, which
+			 * would only start the I/O once, but isn't done here for
+			 * simplicity.  Note also that the following call might actually
+			 * issue two advice calls if we cross a segment boundary; in a
+			 * true asynchronous version we might choose to process only one
+			 * real I/O at a time in that case.
+			 */
+			smgrprefetch(bmr.smgr, forkNum, blockNum, operation->io_buffers_len);
 		}
+
+		/* Indicate that WaitReadBuffers() should be called. */
+		return true;
 	}
+	else
+	{
+		return false;
+	}
+}
 
-	/*
-	 * In RBM_ZERO_AND_LOCK / RBM_ZERO_AND_CLEANUP_LOCK mode, grab the buffer
-	 * content lock before marking the page as valid, to make sure that no
-	 * other backend sees the zeroed page before the caller has had a chance
-	 * to initialize it.
-	 *
-	 * Since no-one else can be looking at the page contents yet, there is no
-	 * difference between an exclusive lock and a cleanup-strength lock. (Note
-	 * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
-	 * they assert that the buffer is already valid.)
-	 */
-	if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
-		!isLocalBuf)
+static inline bool
+WaitReadBuffersCanStartIO(Buffer buffer, bool nowait)
+{
+	if (BufferIsLocal(buffer))
 	{
-		LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
+		BufferDesc *bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+
+		return (pg_atomic_read_u32(&bufHdr->state) & BM_VALID) == 0;
 	}
+	else
+		return StartBufferIO(GetBufferDescriptor(buffer - 1), true, nowait);
+}
+
+void
+WaitReadBuffers(ReadBuffersOperation *operation)
+{
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	int			nblocks;
+	BlockNumber blocknum;
+	ForkNumber	forknum;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	/*
+	 * Currently operations are only allowed to include a read of some range,
+	 * with an optional extra buffer that is already pinned at the end.  So
+	 * nblocks can be at most one more than io_buffers_len.
+	 */
+	Assert((operation->nblocks == operation->io_buffers_len) ||
+		   (operation->nblocks == operation->io_buffers_len + 1));
 
+	/* Find the range of the physical read we need to perform. */
+	nblocks = operation->io_buffers_len;
+	if (nblocks == 0)
+		return;					/* nothing to do */
+
+	buffers = &operation->buffers[0];
+	blocknum = operation->blocknum;
+	forknum = operation->forknum;
+	bmr = operation->bmr;
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/* Only need to adjust flags */
-		uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
-
-		buf_state |= BM_VALID;
-		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+		io_context = IOCONTEXT_NORMAL;
+		io_object = IOOBJECT_TEMP_RELATION;
 	}
 	else
 	{
-		/* Set BM_VALID, terminate IO, and wake up any waiters */
-		TerminateBufferIO(bufHdr, false, BM_VALID, true);
+		io_context = IOContextForStrategy(operation->strategy);
+		io_object = IOOBJECT_RELATION;
 	}
 
-	VacuumPageMiss++;
-	if (VacuumCostActive)
-		VacuumCostBalance += VacuumCostPageMiss;
+	/*
+	 * We count all these blocks as read by this backend.  This is traditional
+	 * behavior, but might turn out not to be true if we find that someone
+	 * else has beaten us and completed the read of some of these blocks.  In
+	 * that case the system globally double-counts, but we traditionally don't
+	 * count this as a "hit", and we don't have a separate counter for "miss,
+	 * but another backend completed the read".
+	 */
+	if (isLocalBuf)
+		pgBufferUsage.local_blks_read += nblocks;
+	else
+		pgBufferUsage.shared_blks_read += nblocks;
 
-	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-									  smgr->smgr_rlocator.locator.spcOid,
-									  smgr->smgr_rlocator.locator.dbOid,
-									  smgr->smgr_rlocator.locator.relNumber,
-									  smgr->smgr_rlocator.backend,
-									  found);
+	for (int i = 0; i < nblocks; ++i)
+	{
+		int			io_buffers_len;
+		Buffer		io_buffers[MAX_BUFFERS_PER_TRANSFER];
+		void	   *io_pages[MAX_BUFFERS_PER_TRANSFER];
+		instr_time	io_start;
+		BlockNumber io_first_block;
 
-	return BufferDescriptorGetBuffer(bufHdr);
+		/*
+		 * Skip this block if someone else has already completed it.  If an
+		 * I/O is already in progress in another backend, this will wait for
+		 * the outcome: either done, or something went wrong and we will
+		 * retry.
+		 */
+		if (!WaitReadBuffersCanStartIO(buffers[i], false))
+		{
+			/*
+			 * Report this as a 'hit' for this backend, even though it must
+			 * have started out as a miss in PrepareReadBuffer().
+			 */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + i,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  true);
+			continue;
+		}
+
+		/* We found a buffer that we need to read in. */
+		io_buffers[0] = buffers[i];
+		io_pages[0] = BufferGetBlock(buffers[i]);
+		io_first_block = blocknum + i;
+		io_buffers_len = 1;
+
+		/*
+		 * How many neighboring-on-disk blocks can we scatter-read into
+		 * other buffers at the same time?  In this case we don't wait if we
+		 * see an I/O already in progress.  We already hold BM_IO_IN_PROGRESS
+		 * for the head block, so we should get on with that I/O as soon as
+		 * possible.  We'll come back to this block again, above.
+		 */
+		while ((i + 1) < nblocks &&
+			   WaitReadBuffersCanStartIO(buffers[i + 1], true))
+		{
+			/* Must be consecutive block numbers. */
+			Assert(BufferGetBlockNumber(buffers[i + 1]) ==
+				   BufferGetBlockNumber(buffers[i]) + 1);
+
+			io_buffers[io_buffers_len] = buffers[++i];
+			io_pages[io_buffers_len++] = BufferGetBlock(buffers[i]);
+		}
+
+		io_start = pgstat_prepare_io_time(track_io_timing);
+		smgrreadv(bmr.smgr, forknum, io_first_block, io_pages, io_buffers_len);
+		pgstat_count_io_op_time(io_object, io_context, IOOP_READ, io_start,
+								io_buffers_len);
+
+		/* Verify each block we read, and terminate the I/O. */
+		for (int j = 0; j < io_buffers_len; ++j)
+		{
+			BufferDesc *bufHdr;
+			Block		bufBlock;
+
+			if (isLocalBuf)
+			{
+				bufHdr = GetLocalBufferDescriptor(-io_buffers[j] - 1);
+				bufBlock = LocalBufHdrGetBlock(bufHdr);
+			}
+			else
+			{
+				bufHdr = GetBufferDescriptor(io_buffers[j] - 1);
+				bufBlock = BufHdrGetBlock(bufHdr);
+			}
+
+			/* check for garbage data */
+			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
+										PIV_LOG_WARNING | PIV_REPORT_STAT))
+			{
+				if ((operation->flags & READ_BUFFERS_ZERO_ON_ERROR) || zero_damaged_pages)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s; zeroing out page",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+					memset(bufBlock, 0, BLCKSZ);
+				}
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+			}
+
+			/* Terminate I/O and set BM_VALID. */
+			if (isLocalBuf)
+			{
+				uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
+
+				buf_state |= BM_VALID;
+				pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+			}
+			else
+			{
+				/* Set BM_VALID, terminate IO, and wake up any waiters */
+				TerminateBufferIO(bufHdr, false, BM_VALID, true);
+			}
+
+			/* Report I/Os as completing individually. */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, io_first_block + j,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  false);
+		}
+
+		VacuumPageMiss += io_buffers_len;
+		if (VacuumCostActive)
+			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
+	}
 }
 
 /*
- * BufferAlloc -- subroutine for ReadBuffer.  Handles lookup of a shared
- *		buffer.  If no buffer exists already, selects a replacement
- *		victim and evicts the old page, but does NOT read in new page.
+ * BufferAlloc -- subroutine for StartReadBuffers.  Handles lookup of a shared
+ *		buffer.  If no buffer exists already, selects a replacement victim and
+ *		evicts the old page, but does NOT read in new page.
  *
  * "strategy" can be a buffer replacement strategy object, or NULL for
  * the default strategy.  The selected buffer's usage_count is advanced when
@@ -1224,11 +1476,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
  *
  * The returned buffer is pinned and is already marked as holding the
  * desired page.  If it already did have the desired page, *foundPtr is
- * set true.  Otherwise, *foundPtr is set false and the buffer is marked
- * as IO_IN_PROGRESS; ReadBuffer will now need to do I/O to fill it.
- *
- * *foundPtr is actually redundant with the buffer's BM_VALID flag, but
- * we keep it for simplicity in ReadBuffer.
+ * set true.  Otherwise, *foundPtr is set false.
  *
  * io_context is passed as an output parameter to avoid calling
  * IOContextForStrategy() when there is a shared buffers hit and no IO
@@ -1287,19 +1535,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(buf, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return buf;
@@ -1364,19 +1603,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(existing_buf_hdr, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return existing_buf_hdr;
@@ -1408,15 +1638,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	LWLockRelease(newPartitionLock);
 
 	/*
-	 * Buffer contents are currently invalid.  Try to obtain the right to
-	 * start I/O.  If StartBufferIO returns false, then someone else managed
-	 * to read it before we did, so there's nothing left for BufferAlloc() to
-	 * do.
+	 * Buffer contents are currently invalid.
 	 */
-	if (StartBufferIO(victim_buf_hdr, true))
-		*foundPtr = false;
-	else
-		*foundPtr = true;
+	*foundPtr = false;
 
 	return victim_buf_hdr;
 }
@@ -1770,7 +1994,7 @@ again:
  * pessimistic, but outside of toy-sized shared_buffers it should allow
  * sufficient pins.
  */
-static void
+void
 LimitAdditionalPins(uint32 *additional_pins)
 {
 	uint32		max_backends;
@@ -2035,7 +2259,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 
 				buf_state &= ~BM_VALID;
 				UnlockBufHdr(existing_hdr, buf_state);
-			} while (!StartBufferIO(existing_hdr, true));
+			} while (!StartBufferIO(existing_hdr, true, false));
 		}
 		else
 		{
@@ -2058,7 +2282,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 			LWLockRelease(partition_lock);
 
 			/* XXX: could combine the locked operations in it with the above */
-			StartBufferIO(victim_buf_hdr, true);
+			StartBufferIO(victim_buf_hdr, true, false);
 		}
 	}
 
@@ -2373,7 +2597,12 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 	else
 	{
 		/*
-		 * If we previously pinned the buffer, it must surely be valid.
+		 * If we previously pinned the buffer, it is likely to be valid, but
+		 * it may not be if StartReadBuffers() was called and
+		 * WaitReadBuffers() hasn't been called yet.  We'll check by loading
+		 * the flags without locking.  This is racy, but it's OK to return
+		 * false spuriously: when WaitReadBuffers() calls StartBufferIO(),
+		 * it'll see that it's now valid.
 		 *
 		 * Note: We deliberately avoid a Valgrind client request here.
 		 * Individual access methods can optionally superimpose buffer page
@@ -2382,7 +2611,7 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 		 * that the buffer page is legitimately non-accessible here.  We
 		 * cannot meddle with that.
 		 */
-		result = true;
+		result = (pg_atomic_read_u32(&buf->state) & BM_VALID) != 0;
 	}
 
 	ref->refcount++;
@@ -3450,7 +3679,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOObject io_object,
 	 * someone else flushed the buffer before we could, so we need not do
 	 * anything.
 	 */
-	if (!StartBufferIO(buf, false))
+	if (!StartBufferIO(buf, false, false))
 		return;
 
 	/* Setup error traceback support for ereport() */
@@ -5185,9 +5414,15 @@ WaitIO(BufferDesc *buf)
  *
  * Returns true if we successfully marked the buffer as I/O busy,
  * false if someone else already did the work.
+ *
+ * If nowait is true, then we don't wait for an I/O to be finished by another
+ * backend.  In that case, false indicates that the I/O was either already
+ * finished or still in progress.  This is useful for callers that want to
+ * find out if they can perform the I/O as part of a larger operation, without
+ * waiting for the answer or distinguishing the reasons why not.
  */
 static bool
-StartBufferIO(BufferDesc *buf, bool forInput)
+StartBufferIO(BufferDesc *buf, bool forInput, bool nowait)
 {
 	uint32		buf_state;
 
@@ -5200,6 +5435,8 @@ StartBufferIO(BufferDesc *buf, bool forInput)
 		if (!(buf_state & BM_IO_IN_PROGRESS))
 			break;
 		UnlockBufHdr(buf, buf_state);
+		if (nowait)
+			return false;
 		WaitIO(buf);
 	}
 
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index 1f02fed250e..6956d4e5b49 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -109,10 +109,9 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
  * LocalBufferAlloc -
  *	  Find or create a local buffer for the given page of the given relation.
  *
- * API is similar to bufmgr.c's BufferAlloc, except that we do not need
- * to do any locking since this is all local.   Also, IO_IN_PROGRESS
- * does not get set.  Lastly, we support only default access strategy
- * (hence, usage_count is always advanced).
+ * API is similar to bufmgr.c's BufferAlloc, except that we do not need to do
+ * any locking since this is all local.  We support only default access
+ * strategy (hence, usage_count is always advanced).
  */
 BufferDesc *
 LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
@@ -288,7 +287,7 @@ GetLocalVictimBuffer(void)
 }
 
 /* see LimitAdditionalPins() */
-static void
+void
 LimitAdditionalLocalPins(uint32 *additional_pins)
 {
 	uint32		max_pins;
@@ -298,9 +297,10 @@ LimitAdditionalLocalPins(uint32 *additional_pins)
 
 	/*
 	 * In contrast to LimitAdditionalPins() other backends don't play a role
-	 * here. We can allow up to NLocBuffer pins in total.
+	 * here. We can allow up to NLocBuffer pins in total, but it might not be
+	 * initialized yet, so read num_temp_buffers instead.
 	 */
-	max_pins = (NLocBuffer - NLocalPinnedBuffers);
+	max_pins = (num_temp_buffers - NLocalPinnedBuffers);
 
 	if (*additional_pins >= max_pins)
 		*additional_pins = max_pins;
diff --git a/src/backend/storage/meson.build b/src/backend/storage/meson.build
index 40345bdca27..739d13293fb 100644
--- a/src/backend/storage/meson.build
+++ b/src/backend/storage/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
+subdir('aio')
 subdir('buffer')
 subdir('file')
 subdir('freespace')
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d3353..b57f71f97e3 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,6 +14,7 @@
 #ifndef BUFMGR_H
 #define BUFMGR_H
 
+#include "port/pg_iovec.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/bufpage.h"
@@ -158,6 +159,11 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 #define BUFFER_LOCK_SHARE		1
 #define BUFFER_LOCK_EXCLUSIVE	2
 
+/*
+ * Maximum number of buffers for multi-buffer I/O functions.  This is set to
+ * allow 128kB transfers, unless BLCKSZ and IOV_MAX imply a smaller maximum.
+ */
+#define MAX_BUFFERS_PER_TRANSFER Min(PG_IOV_MAX, (128 * 1024) / BLCKSZ)
 
 /*
  * prototypes for functions in bufmgr.c
@@ -177,6 +183,42 @@ extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
 										ForkNumber forkNum, BlockNumber blockNum,
 										ReadBufferMode mode, BufferAccessStrategy strategy,
 										bool permanent);
+
+#define READ_BUFFERS_ZERO_ON_ERROR 0x01
+#define READ_BUFFERS_ISSUE_ADVICE 0x02
+
+/*
+ * Private state used by StartReadBuffers() and WaitReadBuffers().  Declared
+ * in public header only to allow inclusion in other structs, but contents
+ * should not be accessed.
+ */
+struct ReadBuffersOperation
+{
+	/* Parameters passed in to StartReadBuffers(). */
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	ForkNumber	forknum;
+	BlockNumber blocknum;
+	int			nblocks;
+	BufferAccessStrategy strategy;
+	int			flags;
+
+	/* Range of buffers, if we need to perform a read. */
+	int			io_buffers_len;
+};
+
+typedef struct ReadBuffersOperation ReadBuffersOperation;
+
+extern bool StartReadBuffers(BufferManagerRelation bmr,
+							 Buffer *buffers,
+							 ForkNumber forknum,
+							 BlockNumber blocknum,
+							 int *nblocks,
+							 BufferAccessStrategy strategy,
+							 int flags,
+							 ReadBuffersOperation *operation);
+extern void WaitReadBuffers(ReadBuffersOperation *operation);
+
 extern void ReleaseBuffer(Buffer buffer);
 extern void UnlockReleaseBuffer(Buffer buffer);
 extern bool BufferIsExclusiveLocked(Buffer buffer);
@@ -250,6 +292,9 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
 
 extern bool BgBufferSync(struct WritebackContext *wb_context);
 
+extern void LimitAdditionalPins(uint32 *additional_pins);
+extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+
 /* in buf_init.c */
 extern void InitBufferPool(void);
 extern Size BufferShmemSize(void);
diff --git a/src/include/storage/streaming_read.h b/src/include/storage/streaming_read.h
new file mode 100644
index 00000000000..c4d3892bb26
--- /dev/null
+++ b/src/include/storage/streaming_read.h
@@ -0,0 +1,52 @@
+#ifndef STREAMING_READ_H
+#define STREAMING_READ_H
+
+#include "storage/bufmgr.h"
+#include "storage/fd.h"
+#include "storage/smgr.h"
+
+/* Default tuning, reasonable for many users. */
+#define PGSR_FLAG_DEFAULT 0x00
+
+/*
+ * I/O streams that are performing maintenance work on behalf of potentially
+ * many users.
+ */
+#define PGSR_FLAG_MAINTENANCE 0x01
+
+/*
+ * We usually avoid issuing prefetch advice automatically when sequential
+ * access is detected, but this flag explicitly disables it, for cases that
+ * might not be correctly detected.  Explicit advice is known to perform worse
+ * than letting the kernel (at least Linux) detect sequential access.
+ */
+#define PGSR_FLAG_SEQUENTIAL 0x02
+
+/*
+ * We usually ramp up from smaller reads to larger ones, to support users who
+ * don't know if it's worth reading lots of buffers yet.  This flag disables
+ * that, declaring ahead of time that we'll be reading all available buffers.
+ */
+#define PGSR_FLAG_FULL 0x04
+
+struct PgStreamingRead;
+typedef struct PgStreamingRead PgStreamingRead;
+
+/* Callback that returns the next block number to read. */
+typedef BlockNumber (*PgStreamingReadBufferCB) (PgStreamingRead *pgsr,
+												void *pgsr_private,
+												void *per_buffer_private);
+
+extern PgStreamingRead *pg_streaming_read_buffer_alloc(int flags,
+													   void *pgsr_private,
+													   size_t per_buffer_private_size,
+													   BufferAccessStrategy strategy,
+													   BufferManagerRelation bmr,
+													   ForkNumber forknum,
+													   PgStreamingReadBufferCB next_block_cb);
+
+extern void pg_streaming_read_prefetch(PgStreamingRead *pgsr);
+extern Buffer pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_private);
+extern void pg_streaming_read_free(PgStreamingRead *pgsr);
+
+#endif
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index fc8b15d0cf2..cfb58cf4836 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,8 @@ PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
 PgStat_WalStats
+PgStreamingRead
+PgStreamingReadRange
 PgXmlErrorContext
 PgXmlStrictness
 Pg_finfo_record
@@ -2267,6 +2269,7 @@ ReInitializeDSMForeignScan_function
 ReScanForeignScan_function
 ReadBufPtrType
 ReadBufferMode
+ReadBuffersOperation
 ReadBytePtrType
 ReadExtraTocPtrType
 ReadFunc
-- 
2.37.2

v5-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
From 1bcc771d1b6baeb70b50a071a9274842051d6365 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v5 2/7] Add lazy_scan_skip unskippable state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits clearer, first
add a struct to LVRelState containing variables needed to skip
ranges less than SKIP_PAGES_THRESHOLD.

lazy_scan_prune() and lazy_scan_new_or_empty() can now access the
buffer containing the relevant block of the visibility map through
LVRelState.skip, so it no longer needs to be a separate function
parameter.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 154 ++++++++++++++++-----------
 1 file changed, 90 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index e21c1124f5c..077164896fb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -210,6 +210,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Buffer containing next unskippable block's visibility info */
+		Buffer		vmbuffer;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			skip;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -220,19 +236,15 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
-								   bool sharelock, Buffer vmbuffer);
+								   bool sharelock);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
-							Buffer vmbuffer, bool all_visible_according_to_vm,
+							bool all_visible_according_to_vm,
 							bool *has_lpdead_items);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -809,12 +821,8 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -828,10 +836,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.vmbuffer = InvalidBuffer;
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -840,26 +847,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->skip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->skip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -902,10 +906,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacrel->skip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacrel->skip.vmbuffer);
+				vacrel->skip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -930,7 +934,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
@@ -948,8 +952,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			LockBuffer(buf, BUFFER_LOCK_SHARE);
 
 		/* Check for new or empty pages before lazy_scan_[no]prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
-								   vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -991,7 +994,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1041,8 +1044,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacrel->skip.vmbuffer))
+	{
+		ReleaseBuffer(vacrel->skip.vmbuffer);
+		vacrel->skip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1086,15 +1092,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in skip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in skip->next_unskippable_block and skip->next_unskippable_allvis.
+ * skip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1104,25 +1129,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
 {
+	/* Use local variables for better optimized loop code */
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->skip.next_unskippable_allvis = true;
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &vacrel->skip.vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1143,7 +1169,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1168,6 +1194,8 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	vacrel->skip.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1178,16 +1206,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->skip.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->skip.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
@@ -1220,7 +1246,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
  */
 static bool
 lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
-					   Page page, bool sharelock, Buffer vmbuffer)
+					   Page page, bool sharelock)
 {
 	Size		freespace;
 
@@ -1306,7 +1332,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacrel->skip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 		}
@@ -1342,10 +1368,11 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * any tuple that becomes dead after the call to heap_page_prune() can't need to
  * be frozen, because it was visible to another session when vacuum started.
  *
- * vmbuffer is the buffer containing the VM block with visibility information
- * for the heap block, blkno. all_visible_according_to_vm is the saved
- * visibility status of the heap block looked up earlier by the caller. We
- * won't rely entirely on this status, as it may be out of date.
+ * vacrel->skip.vmbuffer is the buffer containing the VM block with
+ * visibility information for the heap block, blkno.
+ * all_visible_according_to_vm is the saved visibility status of the heap block
+ * looked up earlier by the caller. We won't rely entirely on this status, as
+ * it may be out of date.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1355,7 +1382,6 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
-				Buffer vmbuffer,
 				bool all_visible_according_to_vm,
 				bool *has_lpdead_items)
 {
@@ -1789,7 +1815,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, visibility_cutoff_xid,
+						  vacrel->skip.vmbuffer, visibility_cutoff_xid,
 						  flags);
 	}
 
@@ -1800,11 +1826,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+			 visibilitymap_get_status(vacrel->rel, blkno, &vacrel->skip.vmbuffer) != 0)
 	{
 		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1828,7 +1854,7 @@ lazy_scan_prune(LVRelState *vacrel,
 			 vacrel->relname, blkno);
 		PageClearAllVisible(page);
 		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1838,7 +1864,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * true, so we must check both all_visible and all_frozen.
 	 */
 	else if (all_visible_according_to_vm && all_visible &&
-			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
 	{
 		/*
 		 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1860,7 +1886,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		 */
 		Assert(!TransactionIdIsValid(visibility_cutoff_xid));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, InvalidTransactionId,
+						  vacrel->skip.vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
 						  VISIBILITYMAP_ALL_FROZEN);
 	}
-- 
2.37.2
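The patch above centralizes the skip-threshold decision in lazy_scan_skip(). The core of that decision can be sketched as follows; SKIP_PAGES_THRESHOLD and the range semantics mirror the patch, but this is illustrative code, not the PostgreSQL implementation:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int BlockNumber;

#define SKIP_PAGES_THRESHOLD 32

/*
 * Decide whether the all-visible range [next_block, next_unskippable_block)
 * is long enough to be worth skipping.  Short ranges are scanned anyway:
 * sequential reads are cheap, and skipping any all-visible page makes
 * advancing relfrozenxid unsafe.
 */
static bool
range_is_skipped(BlockNumber next_block, BlockNumber next_unskippable_block)
{
    return next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD;
}
```

With the skip state held in LVRelState, lazy_scan_heap() only consults the cached skipping_current_range flag instead of recomputing this per block.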

v5-0006-Vacuum-first-pass-uses-Streaming-Read-interface.patch
From f7e49269cf4e5ef0d2ef3ba0be42351bc1030ddb Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 11:29:02 -0500
Subject: [PATCH v5 6/7] Vacuum first pass uses Streaming Read interface

Now vacuum's first pass, which HOT prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by implementing a
streaming read callback which invokes heap_vac_scan_get_next_block().
---
 src/backend/access/heap/vacuumlazy.c | 79 +++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ea270941379..96150834614 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -59,6 +59,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/streaming_read.h"
 #include "tcop/tcopprot.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -174,7 +175,12 @@ typedef struct LVRelState
 	char	   *relnamespace;
 	char	   *relname;
 	char	   *indname;		/* Current index name */
-	BlockNumber blkno;			/* used only for heap operations */
+
+	/*
+	 * The current block being processed by vacuum. Used only for heap
+	 * operations. Primarily for error reporting and logging.
+	 */
+	BlockNumber blkno;
 	OffsetNumber offnum;		/* used only for heap operations */
 	VacErrPhase phase;
 	bool		verbose;		/* VACUUM VERBOSE? */
@@ -195,6 +201,12 @@ typedef struct LVRelState
 	BlockNumber missed_dead_pages;	/* # pages with missed dead tuples */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 
+	/*
+	 * The most recent block submitted to the streaming read interface by the
+	 * first vacuum pass.
+	 */
+	BlockNumber blkno_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -238,7 +250,7 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+static void heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 										 BlockNumber *blkno,
 										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
@@ -422,6 +434,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	vacrel->nonempty_pages = 0;
 	/* dead_items_alloc allocates vacrel->dead_items later on */
 
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	vacrel->blkno_prefetch = InvalidBlockNumber;
+
 	/* Allocate/initialize output statistics state */
 	vacrel->new_rel_tuples = 0;
 	vacrel->new_live_tuples = 0;
@@ -782,6 +797,22 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	}
 }
 
+static BlockNumber
+vacuum_scan_pgsr_next(PgStreamingRead *pgsr,
+					  void *pgsr_private, void *per_buffer_data)
+{
+	LVRelState *vacrel = pgsr_private;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
+
+	vacrel->blkno_prefetch++;
+
+	heap_vac_scan_get_next_block(vacrel,
+								 vacrel->blkno_prefetch, &vacrel->blkno_prefetch,
+								 all_visible_according_to_vm);
+
+	return vacrel->blkno_prefetch;
+}
+
 /*
  *	lazy_scan_heap() -- workhorse function for VACUUM
  *
@@ -821,12 +852,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
-	/* relies on InvalidBlockNumber overflowing to 0 */
-	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -834,6 +864,11 @@ lazy_scan_heap(LVRelState *vacrel)
 		PROGRESS_VACUUM_MAX_DEAD_TUPLES
 	};
 	int64		initprog_val[3];
+	PgStreamingRead *pgsr;
+
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(bool), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_scan_pgsr_next);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
@@ -844,13 +879,19 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
-										&blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr, (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
+		BlockNumber blkno;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+
+		page = BufferGetPage(buf);
 
 		vacrel->scanned_pages++;
 
@@ -918,9 +959,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -976,7 +1014,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							all_visible_according_to_vm,
+							*all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1033,7 +1071,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, vacrel->rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1048,6 +1086,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	pg_streaming_read_free(pgsr);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1059,11 +1099,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (vacrel->rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, vacrel->rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, vacrel->rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1096,7 +1136,7 @@ lazy_scan_heap(LVRelState *vacrel)
  *
  * The block number and visibility status of the next block to process are set
  * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
- * returns false if there are no further blocks to process.
+ * sets blkno to InvalidBlockNumber if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here; vacuum options and information about the
  * relation are read and vacrel->skippedallvis is set to ensure we don't
@@ -1116,7 +1156,7 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static bool
+static void
 heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
@@ -1125,7 +1165,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 	if (next_block >= vacrel->rel_pages)
 	{
 		*blkno = InvalidBlockNumber;
-		return false;
+		return;
 	}
 
 	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
@@ -1220,7 +1260,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 		*all_visible_according_to_vm = true;
 
 	*blkno = next_block;
-	return true;
 }
 
 /*
-- 
2.37.2
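The patch above turns the first pass into a pull-style stream: the streaming read machinery repeatedly asks a callback for the next block number and stops at InvalidBlockNumber. A minimal self-contained sketch of that contract (names loosely modeled on vacuum_scan_pgsr_next(); not the actual API):

```c
#include <assert.h>

typedef unsigned int BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

typedef struct ScanState
{
    BlockNumber next_block;     /* next block the callback will hand out */
    BlockNumber nblocks;        /* total blocks in the "relation" */
} ScanState;

/* Callback: return the next block to read, or InvalidBlockNumber at EOF. */
static BlockNumber
scan_next_block(void *cb_private)
{
    ScanState  *state = cb_private;

    if (state->next_block >= state->nblocks)
        return InvalidBlockNumber;
    return state->next_block++;
}

/* Consumer loop: count how many blocks the stream yields in total. */
static int
consume_stream(ScanState *state)
{
    int         nread = 0;

    while (scan_next_block(state) != InvalidBlockNumber)
        nread++;
    return nread;
}
```

In the patch, the callback additionally runs heap_vac_scan_get_next_block() so that skipped ranges never enter the stream, and per-buffer data carries the block's all-visible status to the consumer.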

v5-0007-Vacuum-second-pass-uses-Streaming-Read-interface.patch
From 4aa8c614440ac384459b07f803194bd08fbca554 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v5 7/7] Vacuum second pass uses Streaming Read interface

Now vacuum's second pass, which removes dead items referring to dead
tuples catalogued in the first pass, uses the streaming read API by
implementing a streaming read callback which returns the next block
containing previously catalogued dead items. A new struct,
VacReapBlkState, is introduced to provide the caller with the starting
and ending indexes of dead items to vacuum.
---
 src/backend/access/heap/vacuumlazy.c | 106 +++++++++++++++++++++------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 83 insertions(+), 24 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 96150834614..99514fa960c 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -207,6 +207,12 @@ typedef struct LVRelState
 	 */
 	BlockNumber blkno_prefetch;
 
+	/*
+	 * The index of the next TID in dead_items to reap during the second
+	 * vacuum pass.
+	 */
+	int			idx_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -248,6 +254,21 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * State set up by the streaming read callback during vacuum's second pass,
+ * which removes dead items referring to dead tuples cataloged in the first pass
+ */
+typedef struct VacReapBlkState
+{
+	/*
+	 * The indexes of the TIDs of the first and last dead tuples in a single
+	 * block in the currently vacuumed relation. The callback will set these
+	 * up prior to adding this block to the stream.
+	 */
+	int			start_idx;
+	int			end_idx;
+} VacReapBlkState;
+
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
@@ -266,8 +287,9 @@ static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 static void lazy_vacuum(LVRelState *vacrel);
 static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void lazy_vacuum_heap_rel(LVRelState *vacrel);
-static int	lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
-								  Buffer buffer, int index, Buffer vmbuffer);
+static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+								  Buffer buffer, Buffer vmbuffer,
+								  VacReapBlkState *rbstate);
 static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
 static void lazy_cleanup_all_indexes(LVRelState *vacrel);
 static IndexBulkDeleteResult *lazy_vacuum_one_index(Relation indrel,
@@ -2407,6 +2429,37 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_pgsr_next(PgStreamingRead *pgsr,
+						 void *pgsr_private,
+						 void *per_buffer_data)
+{
+	BlockNumber blkno;
+	LVRelState *vacrel = pgsr_private;
+	VacReapBlkState *rbstate = per_buffer_data;
+
+	VacDeadItems *dead_items = vacrel->dead_items;
+
+	if (vacrel->idx_prefetch == dead_items->num_items)
+		return InvalidBlockNumber;
+
+	blkno = ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+	rbstate->start_idx = vacrel->idx_prefetch;
+
+	for (; vacrel->idx_prefetch < dead_items->num_items; vacrel->idx_prefetch++)
+	{
+		BlockNumber curblkno =
+			ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+
+		if (blkno != curblkno)
+			break;				/* past end of tuples for this block */
+	}
+
+	rbstate->end_idx = vacrel->idx_prefetch;
+
+	return blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2428,7 +2481,9 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
-	int			index = 0;
+	Buffer		buf;
+	PgStreamingRead *pgsr;
+	VacReapBlkState *rbstate;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2446,17 +2501,21 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 VACUUM_ERRCB_PHASE_VACUUM_HEAP,
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
-	while (index < vacrel->dead_items->num_items)
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(VacReapBlkState), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_reap_lp_pgsr_next);
+
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr,
+														   (void **) &rbstate)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		blkno = ItemPointerGetBlockNumber(&vacrel->dead_items->items[index]);
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		/*
 		 * Pin the visibility map page in case we need to mark the page
@@ -2466,10 +2525,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
-		index = lazy_vacuum_heap_page(vacrel, blkno, buf, index, vmbuffer);
+		lazy_vacuum_heap_page(vacrel, blkno, buf, vmbuffer, rbstate);
 
 		/* Now that we've vacuumed the page, record its available space */
 		page = BufferGetPage(buf);
@@ -2490,9 +2547,11 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 	 */
 	Assert(index > 0);
 	Assert(vacrel->num_index_scans > 1 ||
-		   (index == vacrel->lpdead_items &&
+		   (rbstate->end_idx == vacrel->lpdead_items &&
 			vacuumed_pages == vacrel->lpdead_item_pages));
 
+	pg_streaming_read_free(pgsr);
+
 	ereport(DEBUG2,
 			(errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
 					vacrel->relname, (long long) index, vacuumed_pages)));
@@ -2509,13 +2568,12 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
  * cleanup lock is also acceptable).  vmbuffer must be valid and already have
  * a pin on blkno's visibility map page.
  *
- * index is an offset into the vacrel->dead_items array for the first listed
- * LP_DEAD item on the page.  The return value is the first index immediately
- * after all LP_DEAD items for the same page in the array.
+ * Given a block and the dead items recorded for it during the first pass, mark
+ * those items unused and truncate the line pointer array. Update the VM as appropriate.
  */
-static int
-lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
-					  int index, Buffer vmbuffer)
+static void
+lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+					  Buffer buffer, Buffer vmbuffer, VacReapBlkState *rbstate)
 {
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Page		page = BufferGetPage(buffer);
@@ -2536,16 +2594,17 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; index < dead_items->num_items; index++)
+	for (int i = rbstate->start_idx; i < rbstate->end_idx; i++)
 	{
-		BlockNumber tblk;
 		OffsetNumber toff;
+		ItemPointer dead_item;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&dead_items->items[index]);
-		if (tblk != blkno)
-			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&dead_items->items[index]);
+		dead_item = &dead_items->items[i];
+
+		Assert(ItemPointerGetBlockNumber(dead_item) == blkno);
+
+		toff = ItemPointerGetOffsetNumber(dead_item);
 		itemid = PageGetItemId(page, toff);
 
 		Assert(ItemIdIsDead(itemid) && !ItemIdHasStorage(itemid));
@@ -2615,7 +2674,6 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
-	return index;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cfb58cf4836..af055d5c037 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2972,6 +2972,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacReapBlkState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.37.2
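The second-pass callback in the patch above groups a sorted dead-item array into per-block index ranges, which VacReapBlkState then hands to lazy_vacuum_heap_page(). The grouping step in isolation looks roughly like this (illustrative sketch only, with plain block numbers standing in for ItemPointers):

```c
#include <assert.h>

typedef struct ReapState
{
    const unsigned int *blocks; /* block number of each dead item, sorted */
    int         num_items;
    int         idx;            /* next item index to consume */
} ReapState;

/*
 * Return the next block with dead items and set [*start, *end) to the index
 * range of its items, or return -1 when the array is exhausted.
 */
static int
next_block_range(ReapState *st, int *start, int *end)
{
    unsigned int blkno;

    if (st->idx == st->num_items)
        return -1;

    blkno = st->blocks[st->idx];
    *start = st->idx;
    while (st->idx < st->num_items && st->blocks[st->idx] == blkno)
        st->idx++;              /* consume all items on this block */
    *end = st->idx;
    return (int) blkno;
}
```

Because the dead-item array is sorted by block, each call advances linearly, so the whole second pass visits each block exactly once.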

#13Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#12)
9 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 27/02/2024 21:47, Melanie Plageman wrote:

The attached v5 has some simplifications when compared to v4 but takes
largely the same approach.

0001-0004 are refactoring

I'm looking at just these 0001-0004 patches for now. I like those
changes a lot for the sake of readability even without any of the later
patches.

I made some further changes. I kept them as separate commits for easier
review, see the commit messages for details. Any thoughts on those changes?

I feel heap_vac_scan_get_next_block() function could use some love.
Maybe just some rewording of the comments, or maybe some other
refactoring; not sure. But I'm pretty happy with the function signature
and how it's called.

BTW, do we have tests that would fail if we botched up
heap_vac_scan_get_next_block() so that it would skip pages incorrectly,
for example? Not asking you to write them for this patch, but I'm just
wondering.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v6-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch
From 6e0258c3e31e85526475f46b2e14cbbcbb861909 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v6 1/9] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed-in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8b320c3f89a..1dc6cc8e4db 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1103,8 +1103,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1161,7 +1160,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1174,7 +1172,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.39.2

v6-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchtext/x-patch; charset=UTF-8; name=v6-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patchDownload
From 258bce44e3275bc628bf984892797eecaebf0404 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v6 2/9] Add lazy_scan_skip unskippable state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce a struct to LVRelState containing variables needed to skip
ranges less than SKIP_PAGES_THRESHOLD.

lazy_scan_prune() and lazy_scan_new_or_empty() can now access the
buffer containing the relevant block of the visibility map through
LVRelState.skip, so it no longer needs to be a separate function
parameter.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 154 ++++++++++++++++-----------
 1 file changed, 90 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1dc6cc8e4db..0ddb986bc03 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Buffer containing next unskippable block's visibility info */
+		Buffer		vmbuffer;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			skip;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -214,19 +230,15 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
-								   bool sharelock, Buffer vmbuffer);
+								   bool sharelock);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
-							Buffer vmbuffer, bool all_visible_according_to_vm,
+							bool all_visible_according_to_vm,
 							bool *has_lpdead_items);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -803,12 +815,8 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -822,10 +830,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.vmbuffer = InvalidBuffer;
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -834,26 +841,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->skip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->skip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -896,10 +900,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacrel->skip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacrel->skip.vmbuffer);
+				vacrel->skip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -924,7 +928,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
@@ -942,8 +946,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			LockBuffer(buf, BUFFER_LOCK_SHARE);
 
 		/* Check for new or empty pages before lazy_scan_[no]prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
-								   vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -985,7 +988,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1035,8 +1038,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacrel->skip.vmbuffer))
+	{
+		ReleaseBuffer(vacrel->skip.vmbuffer);
+		vacrel->skip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1080,15 +1086,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in skip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in skip->next_unskippable_block and next_unskippable_allvis.
+ * skip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1098,25 +1123,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
 {
+	/* Use local variables for better optimized loop code */
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->skip.next_unskippable_allvis = true;
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &vacrel->skip.vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1137,7 +1163,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1162,6 +1188,8 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	vacrel->skip.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1172,16 +1200,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->skip.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->skip.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
@@ -1214,7 +1240,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
  */
 static bool
 lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
-					   Page page, bool sharelock, Buffer vmbuffer)
+					   Page page, bool sharelock)
 {
 	Size		freespace;
 
@@ -1300,7 +1326,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacrel->skip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 		}
@@ -1336,10 +1362,11 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * any tuple that becomes dead after the call to heap_page_prune() can't need to
  * be frozen, because it was visible to another session when vacuum started.
  *
- * vmbuffer is the buffer containing the VM block with visibility information
- * for the heap block, blkno. all_visible_according_to_vm is the saved
- * visibility status of the heap block looked up earlier by the caller. We
- * won't rely entirely on this status, as it may be out of date.
+ * vacrel->skip.vmbuffer is the buffer containing the VM block with
+ * visibility information for the heap block, blkno.
+ * all_visible_according_to_vm is the saved visibility status of the heap block
+ * looked up earlier by the caller. We won't rely entirely on this status, as
+ * it may be out of date.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1349,7 +1376,6 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
-				Buffer vmbuffer,
 				bool all_visible_according_to_vm,
 				bool *has_lpdead_items)
 {
@@ -1783,7 +1809,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, visibility_cutoff_xid,
+						  vacrel->skip.vmbuffer, visibility_cutoff_xid,
 						  flags);
 	}
 
@@ -1794,11 +1820,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+			 visibilitymap_get_status(vacrel->rel, blkno, &vacrel->skip.vmbuffer) != 0)
 	{
 		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1822,7 +1848,7 @@ lazy_scan_prune(LVRelState *vacrel,
 			 vacrel->relname, blkno);
 		PageClearAllVisible(page);
 		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1832,7 +1858,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * true, so we must check both all_visible and all_frozen.
 	 */
 	else if (all_visible_according_to_vm && all_visible &&
-			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
 	{
 		/*
 		 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1854,7 +1880,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		 */
 		Assert(!TransactionIdIsValid(visibility_cutoff_xid));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, InvalidTransactionId,
+						  vacrel->skip.vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
 						  VISIBILITYMAP_ALL_FROZEN);
 	}
-- 
2.39.2

v6-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patchtext/x-patch; charset=UTF-8; name=v6-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patchDownload
From 6b2b0313ecf5a8f35a3fbab9e3ba817a5cff2c73 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v6 3/9] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and the skip state in LVRelState it uses into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_get_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_get_next_block() -- instead of only when
we have reached the next unskippable block -- we no longer need the
skipping_current_range variable. lazy_scan_heap() no longer needs to
manage the skipped range -- checking if we reached the end in order to
then call heap_vac_scan_get_next_block(). And
heap_vac_scan_get_next_block() can derive the visibility status of a
block from whether or not we are in a skippable range -- that is,
whether or not the next_block is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 243 ++++++++++++++-------------
 1 file changed, 126 insertions(+), 117 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0ddb986bc03..99d160335e1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -206,8 +206,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
+	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -232,7 +232,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+										 BlockNumber *blkno,
+										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock);
@@ -814,8 +816,11 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -830,40 +835,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+
+	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
+										&blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == vacrel->skip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, blkno + 1);
-
-			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->skip.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1083,20 +1065,14 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in skip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
- * skip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needing a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * heap_vac_scan_get_next_block() sets blkno to the next block that actually
+ * to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1106,14 +1082,25 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in skip->next_unskippable_block and next_unskippable_allvis.
- * skip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacrel->skip.next_unskippable_block and next_unskippable_allvis.
+ *
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
+ * returns false if there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1123,91 +1110,113 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
+static bool
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->skip.next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   &vacrel->skip.vmbuffer);
+		*blkno = InvalidBlockNumber;
+		return false;
+	}
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->skip.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
+
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &vacrel->skip.vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->skip.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->skip.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+		vacrel->skip.next_unskippable_block = next_unskippable_block;
 
-	vacrel->skip.next_unskippable_block = next_unskippable_block;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->skip.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->skip.skipping_current_range = false;
+	if (next_block == vacrel->skip.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
 	else
-	{
-		vacrel->skip.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	*blkno = next_block;
+	return true;
 }
 
 /*
-- 
2.39.2

v6-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchtext/x-patch; charset=UTF-8; name=v6-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patchDownload
From 026af3d4b19feabea8ed78795c64b3278fa93d7f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v6 4/9] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 99d160335e1..65d257aab83 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1184,8 +1184,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->skip.next_unskippable_block = next_unskippable_block;
-- 
2.39.2

v6-0005-Remove-unused-skipping_current_range-field.patch
From b4047b941182af0643838fde056c298d5cc3ae32 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:13:42 +0200
Subject: [PATCH v6 5/9] Remove unused 'skipping_current_range' field

---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65d257aab83..51391870bf3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -217,8 +217,6 @@ typedef struct LVRelState
 		Buffer		vmbuffer;
 		/* Next unskippable block's visibility status */
 		bool		next_unskippable_allvis;
-		/* Whether or not skippable blocks should be skipped */
-		bool		skipping_current_range;
 	}			skip;
 } LVRelState;
 
-- 
2.39.2

v6-0006-Move-vmbuffer-back-to-a-local-varible-in-lazy_sca.patch
From 27e431e8dc69bbf09d831cb1cf2903d16f177d74 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:58:57 +0200
Subject: [PATCH v6 6/9] Move vmbuffer back to a local variable in
 lazy_scan_heap()

It felt confusing that we passed around the current block, 'blkno', as
an argument to lazy_scan_new_or_empty() and lazy_scan_prune(), but
'vmbuffer' was accessed directly in the 'scan_state'.

It was also a bit vague when exactly 'vmbuffer' was valid. Calling
heap_vac_scan_get_next_block() set it, sometimes, to a buffer that
might or might not contain the VM bit for 'blkno'. But other
functions, like lazy_scan_prune(), assumed it to contain the correct
buffer. That was fixed up by visibilitymap_pin(). But clearly it was
not "owned" by heap_vac_scan_get_next_block(), like the other
'scan_state' fields.

I moved it back to a local variable, like it was. Maybe there would be
even better ways to handle it, but at least this is not worse than
what we have in master currently.
---
 src/backend/access/heap/vacuumlazy.c | 72 +++++++++++++---------------
 1 file changed, 33 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 51391870bf3..3f1661cea61 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -213,8 +213,6 @@ typedef struct LVRelState
 	{
 		/* Next unskippable block */
 		BlockNumber next_unskippable_block;
-		/* Buffer containing next unskippable block's visibility info */
-		Buffer		vmbuffer;
 		/* Next unskippable block's visibility status */
 		bool		next_unskippable_allvis;
 	}			skip;
@@ -232,13 +230,14 @@ typedef struct LVSavedErrInfo
 static void lazy_scan_heap(LVRelState *vacrel);
 static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 										 BlockNumber *blkno,
-										 bool *all_visible_according_to_vm);
+										 bool *all_visible_according_to_vm,
+										 Buffer *vmbuffer);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
-								   bool sharelock);
+								   bool sharelock, Buffer vmbuffer);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
-							bool all_visible_according_to_vm,
+							Buffer vmbuffer, bool all_visible_according_to_vm,
 							bool *has_lpdead_items);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -815,11 +814,10 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
-
-	/* relies on InvalidBlockNumber overflowing to 0 */
-	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
+	BlockNumber blkno;
+	bool		all_visible_according_to_vm;
+	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -834,10 +832,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
 	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
-	vacrel->skip.vmbuffer = InvalidBuffer;
 
 	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
-										&blkno, &all_visible_according_to_vm))
+										&blkno, &all_visible_according_to_vm, &vmbuffer))
 	{
 		Buffer		buf;
 		Page		page;
@@ -880,10 +877,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vacrel->skip.vmbuffer))
+			if (BufferIsValid(vmbuffer))
 			{
-				ReleaseBuffer(vacrel->skip.vmbuffer);
-				vacrel->skip.vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vmbuffer);
+				vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -908,7 +905,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
@@ -926,7 +923,8 @@ lazy_scan_heap(LVRelState *vacrel)
 			LockBuffer(buf, BUFFER_LOCK_SHARE);
 
 		/* Check for new or empty pages before lazy_scan_[no]prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
+								   vmbuffer))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -968,7 +966,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							all_visible_according_to_vm,
+							vmbuffer, all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1018,11 +1016,8 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vacrel->skip.vmbuffer))
-	{
-		ReleaseBuffer(vacrel->skip.vmbuffer);
-		vacrel->skip.vmbuffer = InvalidBuffer;
-	}
+	if (BufferIsValid(vmbuffer))
+		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1094,7 +1089,7 @@ lazy_scan_heap(LVRelState *vacrel)
  * relation are read and vacrel->skippedallvis is set to ensure we don't
  * advance relfrozenxid when we have skipped vacuuming all visible blocks.
  *
- * skip->vmbuffer will contain the block from the VM containing visibility
+ * vmbuffer will contain the block from the VM containing visibility
  * information for the next unskippable heap block. We may end up needed a
  * different block from the VM (if we decide not to skip a skippable block).
  * This is okay; visibilitymap_pin() will take care of this while processing
@@ -1110,7 +1105,7 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static bool
 heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
-							 BlockNumber *blkno, bool *all_visible_according_to_vm)
+							 BlockNumber *blkno, bool *all_visible_according_to_vm, Buffer *vmbuffer)
 {
 	bool		skipsallvis = false;
 
@@ -1131,7 +1126,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 		{
 			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 														   next_unskippable_block,
-														   &vacrel->skip.vmbuffer);
+														   vmbuffer);
 
 			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
@@ -1245,7 +1240,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
  */
 static bool
 lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
-					   Page page, bool sharelock)
+					   Page page, bool sharelock, Buffer vmbuffer)
 {
 	Size		freespace;
 
@@ -1331,7 +1326,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vacrel->skip.vmbuffer, InvalidTransactionId,
+							  vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 		}
@@ -1367,11 +1362,10 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * any tuple that becomes dead after the call to heap_page_prune() can't need to
  * be frozen, because it was visible to another session when vacuum started.
  *
- * vacrel->skipstate.vmbuffer is the buffer containing the VM block with
- * visibility information for the heap block, blkno.
- * all_visible_according_to_vm is the saved visibility status of the heap block
- * looked up earlier by the caller. We won't rely entirely on this status, as
- * it may be out of date.
+ * vmbuffer is the buffer containing the VM block with visibility information
+ * for the heap block, blkno. all_visible_according_to_vm is the saved
+ * visibility status of the heap block looked up earlier by the caller. We
+ * won't rely entirely on this status, as it may be out of date.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1381,6 +1375,7 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
+				Buffer vmbuffer,
 				bool all_visible_according_to_vm,
 				bool *has_lpdead_items)
 {
@@ -1814,8 +1809,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vacrel->skip.vmbuffer, visibility_cutoff_xid,
-						  flags);
+						  vmbuffer, visibility_cutoff_xid, flags);
 	}
 
 	/*
@@ -1825,11 +1819,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vacrel->skip.vmbuffer) != 0)
+			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
 	{
 		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1853,7 +1847,7 @@ lazy_scan_prune(LVRelState *vacrel,
 			 vacrel->relname, blkno);
 		PageClearAllVisible(page);
 		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1863,7 +1857,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * true, so we must check both all_visible and all_frozen.
 	 */
 	else if (all_visible_according_to_vm && all_visible &&
-			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
+			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
 	{
 		/*
 		 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1885,7 +1879,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		 */
 		Assert(!TransactionIdIsValid(visibility_cutoff_xid));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vacrel->skip.vmbuffer, InvalidTransactionId,
+						  vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
 						  VISIBILITYMAP_ALL_FROZEN);
 	}
-- 
2.39.2

v6-0007-Rename-skip_state.patch
From 519e26a01b6e6974f9e0edb94b00756af053f7ee Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:27:57 +0200
Subject: [PATCH v6 7/9] Rename skip_state

I don't want to emphasize the "skipping" part. Rather, it's the state
owned by the heap_vac_scan_get_next_block() function.
---
 src/backend/access/heap/vacuumlazy.c | 32 ++++++++++++++--------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3f1661cea61..17e06065f7e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -206,8 +206,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
-	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
+	 * State maintained by heap_vac_scan_get_next_block() to manage skipping
+	 * ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -215,7 +215,7 @@ typedef struct LVRelState
 		BlockNumber next_unskippable_block;
 		/* Next unskippable block's visibility status */
 		bool		next_unskippable_allvis;
-	}			skip;
+	}			get_next_block_state;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -831,7 +831,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
+	vacrel->get_next_block_state.next_unskippable_block = InvalidBlockNumber;
 
 	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
 										&blkno, &all_visible_according_to_vm, &vmbuffer))
@@ -1079,7 +1079,7 @@ lazy_scan_heap(LVRelState *vacrel)
  * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in vacrel->skip->next_unskippable_block and next_unskippable_allvis.
+ * in vacrel->scan_sate->next_unskippable_block and next_unskippable_allvis.
  *
  * The block number and visibility status of the next block to process are set
  * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
@@ -1115,12 +1115,12 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 		return false;
 	}
 
-	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
-		next_block > vacrel->skip.next_unskippable_block)
+	if (vacrel->get_next_block_state.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->get_next_block_state.next_unskippable_block)
 	{
 		/* Use local variables for better optimized loop code */
 		BlockNumber rel_pages = vacrel->rel_pages;
-		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
+		BlockNumber next_unskippable_block = vacrel->get_next_block_state.next_unskippable_block;
 
 		while (++next_unskippable_block < rel_pages)
 		{
@@ -1128,9 +1128,9 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 														   next_unskippable_block,
 														   vmbuffer);
 
-			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
+			vacrel->get_next_block_state.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-			if (!vacrel->skip.next_unskippable_allvis)
+			if (!vacrel->get_next_block_state.next_unskippable_allvis)
 			{
 				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
 				break;
@@ -1155,7 +1155,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 			if (!vacrel->skipwithvm)
 			{
 				/* Caller shouldn't rely on all_visible_according_to_vm */
-				vacrel->skip.next_unskippable_allvis = false;
+				vacrel->get_next_block_state.next_unskippable_allvis = false;
 				break;
 			}
 
@@ -1179,7 +1179,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 			}
 		}
 
-		vacrel->skip.next_unskippable_block = next_unskippable_block;
+		vacrel->get_next_block_state.next_unskippable_block = next_unskippable_block;
 
 		/*
 		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
@@ -1193,16 +1193,16 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 		 * pages then skipping makes updating relfrozenxid unsafe, which is a
 		 * real downside.
 		 */
-		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		if (vacrel->get_next_block_state.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
 		{
-			next_block = vacrel->skip.next_unskippable_block;
+			next_block = vacrel->get_next_block_state.next_unskippable_block;
 			if (skipsallvis)
 				vacrel->skippedallvis = true;
 		}
 	}
 
-	if (next_block == vacrel->skip.next_unskippable_block)
-		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+	if (next_block == vacrel->get_next_block_state.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->get_next_block_state.next_unskippable_allvis;
 	else
 		*all_visible_according_to_vm = true;
 
-- 
2.39.2

v6-0008-Track-current_block-in-the-skip-state.patch
From 6dfae936a29e2d3479273f8ab47778a596258b16 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 21:03:19 +0200
Subject: [PATCH v6 8/9] Track 'current_block' in the skip state

The caller was expected to always pass last blk + 1. It's not clear if
the next_unskippable block accounting would work correctly if you
passed something else. So rather than expecting the caller to do that,
have heap_vac_scan_get_next_block() keep track of the last returned
block itself, in the 'skip' state.

This is largely redundant with the LVRelState->blkno field. But that
one is currently only used for error reporting, so it feels best to
give heap_vac_scan_get_next_block() its own field that it owns.
---
 src/backend/access/heap/vacuumlazy.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 17e06065f7e..535d70b71c3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -211,6 +211,8 @@ typedef struct LVRelState
 	 */
 	struct
 	{
+		BlockNumber current_block;
+
 		/* Next unskippable block */
 		BlockNumber next_unskippable_block;
 		/* Next unskippable block's visibility status */
@@ -228,7 +230,7 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel,
 										 BlockNumber *blkno,
 										 bool *all_visible_according_to_vm,
 										 Buffer *vmbuffer);
@@ -831,10 +833,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	/* initialize for first heap_vac_scan_get_next_block() call */
+	vacrel->get_next_block_state.current_block = InvalidBlockNumber;
 	vacrel->get_next_block_state.next_unskippable_block = InvalidBlockNumber;
 
-	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
-										&blkno, &all_visible_according_to_vm, &vmbuffer))
+	while (heap_vac_scan_get_next_block(vacrel, &blkno, &all_visible_according_to_vm, &vmbuffer))
 	{
 		Buffer		buf;
 		Page		page;
@@ -1061,11 +1064,9 @@ lazy_scan_heap(LVRelState *vacrel)
  *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
  * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum, using the visibility map, vacuum options, and various
- * thresholds to skip blocks which do not need to be processed. Caller passes
- * next_block, the next block in line. This block may end up being skipped.
- * heap_vac_scan_get_next_block() sets blkno to next block that actually needs
- * to be processed.
+ * prune and vacuum.  We use the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed, and set blkno
+ * to next block that actually needs to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1104,14 +1105,16 @@ lazy_scan_heap(LVRelState *vacrel)
  * choice to skip such a range is actually made, making everything safe.)
  */
 static bool
-heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
-							 BlockNumber *blkno, bool *all_visible_according_to_vm, Buffer *vmbuffer)
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber *blkno,
+							 bool *all_visible_according_to_vm, Buffer *vmbuffer)
 {
+	/* Relies on InvalidBlockNumber + 1 == 0 */
+	BlockNumber next_block = vacrel->get_next_block_state.current_block + 1;
 	bool		skipsallvis = false;
 
 	if (next_block >= vacrel->rel_pages)
 	{
-		*blkno = InvalidBlockNumber;
+		vacrel->get_next_block_state.current_block = *blkno = InvalidBlockNumber;
 		return false;
 	}
 
@@ -1206,7 +1209,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 	else
 		*all_visible_according_to_vm = true;
 
-	*blkno = next_block;
+	vacrel->get_next_block_state.current_block = *blkno = next_block;
 	return true;
 }
 
-- 
2.39.2

v6-0009-Comment-whitespace-cleanup.patch
From 619556cad4aad68d1711c12b962e9002e56d8db2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 21:35:11 +0200
Subject: [PATCH v6 9/9] Comment & whitespace cleanup

I moved some of the paragraphs to inside the
heap_vac_scan_get_next_block() function. The explanation in the
function comment at the old place felt like too much detail. Someone
looking at the function signature and how to call it would not care
about all the details of what can or cannot be skipped.

The new place isn't great either, but it will do for now.
---
 src/backend/access/heap/vacuumlazy.c | 41 ++++++++++++++++------------
 1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 535d70b71c3..b8a2dcfbbac 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -228,6 +228,7 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static bool heap_vac_scan_get_next_block(LVRelState *vacrel,
@@ -1068,30 +1069,17 @@ lazy_scan_heap(LVRelState *vacrel)
  * thresholds to skip blocks which do not need to be processed, and set blkno
  * to next block that actually needs to be processed.
  *
- * A block is unskippable if it is not all visible according to the visibility
- * map. It is also unskippable if it is the last block in the relation, if the
- * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
- * vacuum.
- *
- * Even if a block is skippable, we may choose not to skip it if the range of
- * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
- * consequence, we must keep track of the next truly unskippable block and its
- * visibility status separate from the next block lazy_scan_heap() should
- * process (and its visibility status).
- *
- * The block number and visibility status of the next unskippable block are set
- * in vacrel->scan_sate->next_unskippable_block and next_unskippable_allvis.
- *
- * The block number and visibility status of the next block to process are set
- * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
- * returns false if there are no further blocks to process.
+ * The block number and visibility status of the next block to process are
+ * returned in blkno and all_visible_according_to_vm.
+ * heap_vac_scan_get_next_block() returns false if there are no further blocks
+ * to process.
  *
  * vacrel is an in/out parameter here; vacuum options and information about the
  * relation are read and vacrel->skippedallvis is set to ensure we don't
  * advance relfrozenxid when we have skipped vacuuming all visible blocks.
  *
  * vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needed a
+ * information for the next unskippable heap block.  We may end up needing a
  * different block from the VM (if we decide not to skip a skippable block).
  * This is okay; visibilitymap_pin() will take care of this while processing
  * the block.
@@ -1125,6 +1113,23 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		BlockNumber rel_pages = vacrel->rel_pages;
 		BlockNumber next_unskippable_block = vacrel->get_next_block_state.next_unskippable_block;
 
+		/*
+		 * A block is unskippable if it is not all visible according to the
+		 * visibility map.  It is also unskippable if it is the last block in
+		 * the relation, if the vacuum is an aggressive vacuum, or if
+		 * DISABLE_PAGE_SKIPPING was passed to vacuum.
+		 *
+		 * Even if a block is skippable, we may choose not to skip it if the
+		 * range of skippable blocks is too small (below
+		 * SKIP_PAGES_THRESHOLD).  As a consequence, we must keep track of the
+		 * next truly unskippable block and its visibility status separate
+		 * from the next block lazy_scan_heap() should process (and its
+		 * visibility status).
+		 *
+		 * The block number and visibility status of the next unskippable
+		 * block are set in vacrel->scan_sate->next_unskippable_block and
+		 * next_unskippable_allvis.
+		 */
 		while (++next_unskippable_block < rel_pages)
 		{
 			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-- 
2.39.2

#14Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#12)
7 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Tue, Feb 27, 2024 at 02:47:03PM -0500, Melanie Plageman wrote:

On Mon, Jan 29, 2024 at 8:18 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Jan 26, 2024 at 8:28 AM vignesh C <vignesh21@gmail.com> wrote:

CFBot shows that the patch does not apply anymore as in [1]:
=== applying patch
./v3-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelStat.patch
patching file src/backend/access/heap/vacuumlazy.c
...
Hunk #10 FAILED at 1042.
Hunk #11 FAILED at 1121.
Hunk #12 FAILED at 1132.
Hunk #13 FAILED at 1161.
Hunk #14 FAILED at 1172.
Hunk #15 FAILED at 1194.
...
6 out of 21 hunks FAILED -- saving rejects to file
src/backend/access/heap/vacuumlazy.c.rej

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_4755.log

Fixed in attached rebased v4

In light of Thomas' update to the streaming read API [1], I have
rebased and updated this patch set.

The attached v5 has some simplifications when compared to v4 but takes
largely the same approach.

Attached is a patch set (v5a) which updates the streaming read user for
vacuum to fix an issue Andrey Borodin pointed out to me off-list.

Note that I started writing this email before seeing Heikki's upthread
review [1] (/messages/by-id/1eeccf12-d5d1-4b7e-b88b-7342410129d7@iki.fi), so I will respond to that in a bit. There are no changes in
v5a to any of the prelim refactoring patches which Heikki reviewed in
that email. I only changed the vacuum streaming read users (last two
patches in the set).

Back to this patch set:
Andrey pointed out that it was failing to compile on windows and the
reason is that I had accidentally left an undefined variable "index" in
these places

Assert(index > 0);
...
ereport(DEBUG2,
        (errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
                vacrel->relname, (long long) index, vacuumed_pages)));

See https://cirrus-ci.com/task/6312305361682432

I don't understand how this didn't warn me (or fail to compile) for an
assert build on my own workstation. It seems to think "index" is a
function?

Anyway, thinking about what the correct assertion would be here:

Assert(index > 0);
Assert(vacrel->num_index_scans > 1 ||
       (rbstate->end_idx == vacrel->lpdead_items &&
        vacuumed_pages == vacrel->lpdead_item_pages));

I think I can just replace "index" with "rbstate->end_idx". At the end
of reaping, this should have the same value that index would have had.
The issue with this is if pg_streaming_read_buffer_get_next() somehow
never returned a valid buffer (there were no dead items), then rbstate
would potentially be uninitialized. The old assertion (index > 0) would
only have been true if there were some dead items, but there isn't an
explicit assertion in this function that there were some dead items.
Perhaps it is worth adding this? Even if we add this, perhaps it is
unacceptable from a programming standpoint to use rbstate in that scope?

In addition to fixing this slip-up, I have done some performance testing
for streaming read vacuum. Note that these tests are for both vacuum
passes (1 and 2) using streaming read.

Performance results:

The TL;DR of my performance results is that streaming read vacuum is
faster. However there is an issue with the interaction of the streaming
read code and the vacuum buffer access strategy which must be addressed.

Note that "master" in the results below is actually just a commit on my
branch [2] (https://github.com/melanieplageman/postgres/tree/vac_pgsr) before the one adding the vacuum streaming read users. So it
includes all of my refactoring of the vacuum code from the preliminary
patches.

I tested two vacuum "data states". Both are relatively small tables
because the impact of streaming read can easily be seen even at small
table sizes. DDL for both data states is at the end of the email.

The first data state is a 28 MB table which has never been vacuumed and
has one or two dead tuples on every block. All of the blocks have dead
tuples, so all of the blocks must be vacuumed. We'll call this the
"sequential" data state.

The second data state is a 67 MB table which has been vacuumed and then
a small percentage of the blocks (non-consecutive blocks at irregular
intervals) are updated afterward. Because the visibility map has been
updated and only a few blocks have dead tuples, large ranges of blocks
do not need to be vacuumed. There is at least one run of blocks with
dead tuples larger than 1 block but most of the blocks with dead tuples
are a single block followed by many blocks with no dead tuples. We'll
call this the "few" data state.

I tested these data states with "master" and with streaming read vacuum
with three caching options:

- table data fully in shared buffers (via pg_prewarm)
- table data in the kernel buffer cache but not in shared buffers
- table data completely uncached

I tested the OS cached and uncached caching options with both the
default vacuum buffer access strategy and with BUFFER_USAGE_LIMIT 0
(which uses as many shared buffers as needed).

For the streaming read vacuum, I tested with maintenance_io_concurrency
10, 100, and 1000. 10 is the current default on master.
maintenance_io_concurrency is not used by vacuum on master AFAICT.

maintenance_io_concurrency is used by streaming read to determine how
many buffers it can pin at the same time (with the hope of combining
consecutive blocks into larger IOs) and, in the case of vacuum, it is
used to determine prefetch distance.

In the following results, I ran vacuum at least five times and averaged
the timing results.

Table data cached in shared buffers
===================================

Sequential data state
---------------------

The only noticeable difference in performance was that streaming read
vacuum took 2% longer than master (19 ms vs 18.6 ms). It was a bit more
noticeable at maintenance_io_concurrency 1000 than 10.

This may be resolved by a patch Thomas is working on to avoid pinning
too many buffers when larger IOs cannot be created (as in a fully
shared buffers-resident workload). We should revisit this when that
patch is available.

Few data state
--------------

There was no difference in timing for any of the scenarios.

Table data cached in OS buffer cache
====================================

Sequential data state
---------------------

With the buffer access strategy disabled, streaming read vacuum took 11%
less time regardless of maintenance_io_concurrency (23 ms vs 26 ms on
master).

With the default vacuum buffer access strategy,
maintenance_io_concurrency had a large impact:

Note that "mic" is maintenance_io_concurrency

| data state | code      | mic  | time (ms) |
+------------+-----------+------+-----------+
| sequential | master    | NA   |        99 |
| sequential | streaming | 10   |       122 |
| sequential | streaming | 100  |        28 |

The streaming read API calculates the maximum number of pinned buffers
as 4 * maintenance_io_concurrency. The default vacuum buffer access
strategy ring buffer is 256 kB -- which is 32 buffers.

With maintenance_io_concurrency 10, streaming read code wants to pin 40
buffers. There is likely an interaction between this and the buffer
access strategy which leads to the slowdown at
maintenance_io_concurrency 10.

We could change the default maintenance_io_concurrency, but a better
option is to take the buffer access strategy into account in the
streaming read code.

Few data state
--------------

There was no difference in timing for any of the scenarios.

Table data uncached
===================

Sequential data state
---------------------

When the buffer access strategy is disabled, streaming read vacuum takes
12% less time regardless of maintenance_io_concurrency (36 ms vs 41 ms
on master).

With the default buffer access strategy (ring buffer 256 kB) and
maintenance_io_concurrency 10 (the default), the streaming read vacuum
takes 19% more time. But if we bump maintenance_io_concurrency up to
100+, streaming read vacuum takes 64% less time:

| data state | code      | mic  | time (ms) |
+------------+-----------+------+-----------+
| sequential | master    | NA   |       113 |
| sequential | streaming | 10   |       140 |
| sequential | streaming | 100  |        41 |

This is likely due to the same adverse interaction between streaming
reads' max pinned buffers and the buffer access strategy ring buffer
size.

Few data state
--------------

The buffer access strategy had no impact here, so all of these results
are with the default buffer access strategy. The streaming read vacuum
takes 20-25% less time than master vacuum.

| data state | code      | mic  | time (ms) |
+------------+-----------+------+-----------+
| few        | master    | NA   |       4.5 |
| few        | streaming | 10   |       3.4 |
| few        | streaming | 100  |       3.5 |

The improvement is likely due to prefetching and to the one range of
consecutive blocks containing dead tuples, which could be merged into a
larger IO.

Higher maintenance_io_concurrency only helps a little, probably because:

1) most of the blocks to vacuum are not consecutive, so we can't make
bigger IOs in most cases
2) we are not vacuuming enough blocks to want to prefetch more than 10
blocks ahead.

This experiment should probably be redone with larger tables containing
more blocks needing vacuum. At 3-4 ms, a 20% performance difference
isn't really that interesting.

The next step (other than the preliminary refactoring patches) is to
decide how the streaming read API should use the buffer access strategy.

Sequential Data State DDL:
drop table if exists foo;
create table foo (a int) with (autovacuum_enabled=false, fillfactor=25);
insert into foo select i % 3 from generate_series(1,200000)i;
update foo set a = 5 where a = 1;

Few Data State DDL:
drop table if exists foo;
create table foo (a int) with (autovacuum_enabled=false, fillfactor=25);
insert into foo select i from generate_series(2,20000)i;
insert into foo select 1 from generate_series(1,200)i;
insert into foo select i from generate_series(2,20000)i;
insert into foo select 1 from generate_series(1,200)i;
insert into foo select i from generate_series(2,200000)i;
insert into foo select 1 from generate_series(1,200)i;
insert into foo select i from generate_series(2,20000)i;
insert into foo select 1 from generate_series(1,2000)i;
insert into foo select i from generate_series(2,20000)i;
insert into foo select 1 from generate_series(1,200)i;
insert into foo select i from generate_series(2,200000)i;
insert into foo select 1 from generate_series(1,200)i;
vacuum (freeze) foo;
update foo set a = 5 where a = 1;

- Melanie

[1]: /messages/by-id/1eeccf12-d5d1-4b7e-b88b-7342410129d7@iki.fi
[2]: https://github.com/melanieplageman/postgres/tree/vac_pgsr

Attachments:

v5a-0001-lazy_scan_skip-remove-unneeded-local-var-nskippa.patch
From c7a16bbf477c08ebf553da855e0554077c21de69 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v5a 1/7] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8b320c3f89a..1dc6cc8e4db 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1103,8 +1103,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1161,7 +1160,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1174,7 +1172,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.40.1

v5a-0002-Add-lazy_scan_skip-unskippable-state-to-LVRelSta.patch
From c27436eaeedbc1551072922d92460ca1c30ed116 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v5a 2/7] Add lazy_scan_skip unskippable state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits clearer, first
introduce a struct in LVRelState containing the variables needed to skip
ranges of at least SKIP_PAGES_THRESHOLD pages.

lazy_scan_prune() and lazy_scan_new_or_empty() can now access the
buffer containing the relevant block of the visibility map through
LVRelState.skip, so the buffer no longer needs to be passed as a
separate function parameter.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 154 ++++++++++++++++-----------
 1 file changed, 90 insertions(+), 64 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1dc6cc8e4db..0ddb986bc03 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Buffer containing next unskippable block's visibility info */
+		Buffer		vmbuffer;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			skip;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -214,19 +230,15 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
-								   bool sharelock, Buffer vmbuffer);
+								   bool sharelock);
 static void lazy_scan_prune(LVRelState *vacrel, Buffer buf,
 							BlockNumber blkno, Page page,
-							Buffer vmbuffer, bool all_visible_according_to_vm,
+							bool all_visible_according_to_vm,
 							bool *has_lpdead_items);
 static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 							  BlockNumber blkno, Page page,
@@ -803,12 +815,8 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
-	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -822,10 +830,9 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.vmbuffer = InvalidBuffer;
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, 0);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -834,26 +841,23 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->skip.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
+			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, blkno + 1);
 
-			Assert(next_unskippable_block >= blkno + 1);
+			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->skip.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -896,10 +900,10 @@ lazy_scan_heap(LVRelState *vacrel)
 			 * correctness, but we do it anyway to avoid holding the pin
 			 * across a lengthy, unrelated operation.
 			 */
-			if (BufferIsValid(vmbuffer))
+			if (BufferIsValid(vacrel->skip.vmbuffer))
 			{
-				ReleaseBuffer(vmbuffer);
-				vmbuffer = InvalidBuffer;
+				ReleaseBuffer(vacrel->skip.vmbuffer);
+				vacrel->skip.vmbuffer = InvalidBuffer;
 			}
 
 			/* Perform a round of index and heap vacuuming */
@@ -924,7 +928,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * all-visible.  In most cases this will be very cheap, because we'll
 		 * already have the correct page pinned anyway.
 		 */
-		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
+		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
@@ -942,8 +946,7 @@ lazy_scan_heap(LVRelState *vacrel)
 			LockBuffer(buf, BUFFER_LOCK_SHARE);
 
 		/* Check for new or empty pages before lazy_scan_[no]prune call */
-		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock,
-								   vmbuffer))
+		if (lazy_scan_new_or_empty(vacrel, buf, blkno, page, !got_cleanup_lock))
 		{
 			/* Processed as new/empty page (lock and pin released) */
 			continue;
@@ -985,7 +988,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1035,8 +1038,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	vacrel->blkno = InvalidBlockNumber;
-	if (BufferIsValid(vmbuffer))
-		ReleaseBuffer(vmbuffer);
+	if (BufferIsValid(vacrel->skip.vmbuffer))
+	{
+		ReleaseBuffer(vacrel->skip.vmbuffer);
+		vacrel->skip.vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1080,15 +1086,34 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.  Caller passes next_block, the next
+ * block in line. The parameters of the skipped range are recorded in skip.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
+ *
+ * A block is unskippable if it is not all visible according to the visibility
+ * map. It is also unskippable if it is the last block in the relation, if the
+ * vacuum is an aggressive vacuum, or if DISABLE_PAGE_SKIPPING was passed to
+ * vacuum.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * Even if a block is skippable, we may choose not to skip it if the range of
+ * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+ * consequence, we must keep track of the next truly unskippable block and its
+ * visibility status along with whether or not we are skipping the current
+ * range of skippable blocks. This can be used to derive the next block
+ * lazy_scan_heap() must process and its visibility status.
+ *
+ * The block number and visibility status of the next unskippable block are set
+ * in skip->next_unskippable_block and next_unskippable_allvis.
+ * skip->skipping_current_range indicates to the caller whether or not it is
+ * processing a skippable (and thus all-visible) block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1098,25 +1123,26 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
 {
+	/* Use local variables for better optimized loop code */
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->skip.next_unskippable_allvis = true;
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &vacrel->skip.vmbuffer);
 
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1137,7 +1163,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->skip.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1162,6 +1188,8 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	vacrel->skip.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1172,16 +1200,14 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->skip.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->skip.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
-
-	return next_unskippable_block;
 }
 
 /*
@@ -1214,7 +1240,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
  */
 static bool
 lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
-					   Page page, bool sharelock, Buffer vmbuffer)
+					   Page page, bool sharelock)
 {
 	Size		freespace;
 
@@ -1300,7 +1326,7 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
 
 			PageSetAllVisible(page);
 			visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-							  vmbuffer, InvalidTransactionId,
+							  vacrel->skip.vmbuffer, InvalidTransactionId,
 							  VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
 			END_CRIT_SECTION();
 		}
@@ -1336,10 +1362,11 @@ lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno,
  * any tuple that becomes dead after the call to heap_page_prune() can't need to
  * be frozen, because it was visible to another session when vacuum started.
  *
- * vmbuffer is the buffer containing the VM block with visibility information
- * for the heap block, blkno. all_visible_according_to_vm is the saved
- * visibility status of the heap block looked up earlier by the caller. We
- * won't rely entirely on this status, as it may be out of date.
+ * vacrel->skip.vmbuffer is the buffer containing the VM block with
+ * visibility information for the heap block, blkno.
+ * all_visible_according_to_vm is the saved visibility status of the heap block
+ * looked up earlier by the caller. We won't rely entirely on this status, as
+ * it may be out of date.
  *
  * *has_lpdead_items is set to true or false depending on whether, upon return
  * from this function, any LP_DEAD items are still present on the page.
@@ -1349,7 +1376,6 @@ lazy_scan_prune(LVRelState *vacrel,
 				Buffer buf,
 				BlockNumber blkno,
 				Page page,
-				Buffer vmbuffer,
 				bool all_visible_according_to_vm,
 				bool *has_lpdead_items)
 {
@@ -1783,7 +1809,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		PageSetAllVisible(page);
 		MarkBufferDirty(buf);
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, visibility_cutoff_xid,
+						  vacrel->skip.vmbuffer, visibility_cutoff_xid,
 						  flags);
 	}
 
@@ -1794,11 +1820,11 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
-			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
+			 visibilitymap_get_status(vacrel->rel, blkno, &vacrel->skip.vmbuffer) != 0)
 	{
 		elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
 			 vacrel->relname, blkno);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1822,7 +1848,7 @@ lazy_scan_prune(LVRelState *vacrel,
 			 vacrel->relname, blkno);
 		PageClearAllVisible(page);
 		MarkBufferDirty(buf);
-		visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
+		visibilitymap_clear(vacrel->rel, blkno, vacrel->skip.vmbuffer,
 							VISIBILITYMAP_VALID_BITS);
 	}
 
@@ -1832,7 +1858,7 @@ lazy_scan_prune(LVRelState *vacrel,
 	 * true, so we must check both all_visible and all_frozen.
 	 */
 	else if (all_visible_according_to_vm && all_visible &&
-			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vmbuffer))
+			 all_frozen && !VM_ALL_FROZEN(vacrel->rel, blkno, &vacrel->skip.vmbuffer))
 	{
 		/*
 		 * Avoid relying on all_visible_according_to_vm as a proxy for the
@@ -1854,7 +1880,7 @@ lazy_scan_prune(LVRelState *vacrel,
 		 */
 		Assert(!TransactionIdIsValid(visibility_cutoff_xid));
 		visibilitymap_set(vacrel->rel, blkno, buf, InvalidXLogRecPtr,
-						  vmbuffer, InvalidTransactionId,
+						  vacrel->skip.vmbuffer, InvalidTransactionId,
 						  VISIBILITYMAP_ALL_VISIBLE |
 						  VISIBILITYMAP_ALL_FROZEN);
 	}
-- 
2.40.1

v5a-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch
From a33c255419eb539901b8d07852e1c4b189b615df Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v5a 3/7] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and the skip state in LVRelState it uses into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_get_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_get_next_block() -- instead of only when
we have reached the next unskippable block -- we no longer need the
skipping_current_range variable. lazy_scan_heap() no longer needs to
manage the skipped range by checking whether it has reached the end
before calling heap_vac_scan_get_next_block(). And
heap_vac_scan_get_next_block() can derive the visibility status of a
block from whether or not we are in a skippable range -- that is,
whether or not next_block is equal to the next unskippable block.
---
 src/backend/access/heap/vacuumlazy.c | 243 ++++++++++++++-------------
 1 file changed, 126 insertions(+), 117 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0ddb986bc03..99d160335e1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -206,8 +206,8 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_get_next_block() to manage
+	 * skipping ranges of pages greater than SKIP_PAGES_THRESHOLD.
 	 */
 	struct
 	{
@@ -232,7 +232,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block);
+static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+										 BlockNumber *blkno,
+										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock);
@@ -814,8 +816,11 @@ static void
 lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -830,40 +835,17 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, 0);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+
+	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
+										&blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == vacrel->skip.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, blkno + 1);
-
-			Assert(vacrel->skip.next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->skip.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1083,20 +1065,14 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
- *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes next_block, the next
- * block in line. The parameters of the skipped range are recorded in skip.
- * vacrel is an in/out parameter here; vacuum options and information about the
- * relation are read and vacrel->skippedallvis is set to ensure we don't
- * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *	heap_vac_scan_get_next_block() -- get next block for vacuum to process
  *
- * skip->vmbuffer will contain the block from the VM containing visibility
- * information for the next unskippable heap block. We may end up needing a
- * different block from the VM (if we decide not to skip a skippable block).
- * This is okay; visibilitymap_pin() will take care of this while processing
- * the block.
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed. Caller passes
+ * next_block, the next block in line. This block may end up being skipped.
+ * heap_vac_scan_get_next_block() sets blkno to next block that actually needs
+ * to be processed.
  *
  * A block is unskippable if it is not all visible according to the visibility
  * map. It is also unskippable if it is the last block in the relation, if the
@@ -1106,14 +1082,25 @@ lazy_scan_heap(LVRelState *vacrel)
  * Even if a block is skippable, we may choose not to skip it if the range of
  * skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
  * consequence, we must keep track of the next truly unskippable block and its
- * visibility status along with whether or not we are skipping the current
- * range of skippable blocks. This can be used to derive the next block
- * lazy_scan_heap() must process and its visibility status.
+ * visibility status separate from the next block lazy_scan_heap() should
+ * process (and its visibility status).
  *
  * The block number and visibility status of the next unskippable block are set
- * in skip->next_unskippable_block and next_unskippable_allvis.
- * skip->skipping_current_range indicates to the caller whether or not it is
- * processing a skippable (and thus all-visible) block.
+ * in vacrel->skip.next_unskippable_block and next_unskippable_allvis.
+ *
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
+ * returns false if there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all visible blocks.
+ *
+ * skip->vmbuffer will contain the block from the VM containing visibility
+ * information for the next unskippable heap block. We may end up needing a
+ * different block from the VM (if we decide not to skip a skippable block).
+ * This is okay; visibilitymap_pin() will take care of this while processing
+ * the block.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1123,91 +1110,113 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, BlockNumber next_block)
+static bool
+heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->skip.next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   &vacrel->skip.vmbuffer);
+		*blkno = InvalidBlockNumber;
+		return false;
+	}
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->skip.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->skip.next_unskippable_block;
+
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &vacrel->skip.vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->skip.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->skip.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->skip.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->skip.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
 		}
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+		vacrel->skip.next_unskippable_block = next_unskippable_block;
 
-	vacrel->skip.next_unskippable_block = next_unskippable_block;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->skip.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->skip.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->skip.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->skip.skipping_current_range = false;
+	if (next_block == vacrel->skip.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->skip.next_unskippable_allvis;
 	else
-	{
-		vacrel->skip.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
+		*all_visible_according_to_vm = true;
+
+	*blkno = next_block;
+	return true;
 }
 
 /*
-- 
2.40.1

v5a-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac.patch
From 4ab5ed0863ca4fcdc197811d60011971a29b8857 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v5a 4/7] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 99d160335e1..65d257aab83 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1184,8 +1184,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->skip.next_unskippable_block = next_unskippable_block;
-- 
2.40.1

v5a-0005-Streaming-Read-API.patch
From 0c82c513818f1aa3e8aa982344aaac13f54629e4 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 6 Mar 2024 14:46:08 -0500
Subject: [PATCH v5a 5/7] Streaming Read API

---
 src/backend/storage/Makefile             |   2 +-
 src/backend/storage/aio/Makefile         |  14 +
 src/backend/storage/aio/meson.build      |   5 +
 src/backend/storage/aio/streaming_read.c | 612 ++++++++++++++++++++++
 src/backend/storage/buffer/bufmgr.c      | 641 ++++++++++++++++-------
 src/backend/storage/buffer/localbuf.c    |  14 +-
 src/backend/storage/meson.build          |   1 +
 src/include/storage/bufmgr.h             |  45 ++
 src/include/storage/streaming_read.h     |  52 ++
 src/tools/pgindent/typedefs.list         |   3 +
 10 files changed, 1179 insertions(+), 210 deletions(-)
 create mode 100644 src/backend/storage/aio/Makefile
 create mode 100644 src/backend/storage/aio/meson.build
 create mode 100644 src/backend/storage/aio/streaming_read.c
 create mode 100644 src/include/storage/streaming_read.h

diff --git a/src/backend/storage/Makefile b/src/backend/storage/Makefile
index 8376cdfca20..eec03f6f2b4 100644
--- a/src/backend/storage/Makefile
+++ b/src/backend/storage/Makefile
@@ -8,6 +8,6 @@ subdir = src/backend/storage
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS     = buffer file freespace ipc large_object lmgr page smgr sync
+SUBDIRS     = aio buffer file freespace ipc large_object lmgr page smgr sync
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/Makefile b/src/backend/storage/aio/Makefile
new file mode 100644
index 00000000000..bcab44c802f
--- /dev/null
+++ b/src/backend/storage/aio/Makefile
@@ -0,0 +1,14 @@
+#
+# Makefile for storage/aio
+#
+# src/backend/storage/aio/Makefile
+#
+
+subdir = src/backend/storage/aio
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = \
+	streaming_read.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/meson.build b/src/backend/storage/aio/meson.build
new file mode 100644
index 00000000000..39aef2a84a2
--- /dev/null
+++ b/src/backend/storage/aio/meson.build
@@ -0,0 +1,5 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+backend_sources += files(
+  'streaming_read.c',
+)
diff --git a/src/backend/storage/aio/streaming_read.c b/src/backend/storage/aio/streaming_read.c
new file mode 100644
index 00000000000..71f2c4a70b6
--- /dev/null
+++ b/src/backend/storage/aio/streaming_read.c
@@ -0,0 +1,612 @@
+#include "postgres.h"
+
+#include "storage/streaming_read.h"
+#include "utils/rel.h"
+
+/*
+ * Element type for PgStreamingRead's circular array of block ranges.
+ */
+typedef struct PgStreamingReadRange
+{
+	bool		need_wait;
+	bool		advice_issued;
+	BlockNumber blocknum;
+	int			nblocks;
+	int			per_buffer_data_index;
+	Buffer		buffers[MAX_BUFFERS_PER_TRANSFER];
+	ReadBuffersOperation operation;
+} PgStreamingReadRange;
+
+/*
+ * Streaming read object.
+ */
+struct PgStreamingRead
+{
+	int			max_ios;
+	int			ios_in_progress;
+	int			max_pinned_buffers;
+	int			pinned_buffers;
+	int			pinned_buffers_trigger;
+	int			next_tail_buffer;
+	int			ramp_up_pin_limit;
+	int			ramp_up_pin_stall;
+	bool		finished;
+	bool		advice_enabled;
+	void	   *pgsr_private;
+	PgStreamingReadBufferCB callback;
+
+	BufferAccessStrategy strategy;
+	BufferManagerRelation bmr;
+	ForkNumber	forknum;
+
+	/* Sometimes we need to buffer one block for flow control. */
+	BlockNumber unget_blocknum;
+	void	   *unget_per_buffer_data;
+
+	/* Next expected block, for detecting sequential access. */
+	BlockNumber seq_blocknum;
+
+	/* Space for optional per-buffer private data. */
+	size_t		per_buffer_data_size;
+	void	   *per_buffer_data;
+
+	/* Circular buffer of ranges. */
+	int			size;
+	int			head;
+	int			tail;
+	PgStreamingReadRange ranges[FLEXIBLE_ARRAY_MEMBER];
+};
+
+static PgStreamingRead *
+pg_streaming_read_buffer_alloc_internal(int flags,
+										void *pgsr_private,
+										size_t per_buffer_data_size,
+										BufferAccessStrategy strategy)
+{
+	PgStreamingRead *pgsr;
+	int			size;
+	int			max_ios;
+	uint32		max_pinned_buffers;
+
+
+	/*
+	 * Decide how many assumed I/Os we will allow to run concurrently.  That
+	 * is, advice to the kernel to tell it that we will soon read.  This
+	 * number also affects how far we look ahead for opportunities to start
+	 * more I/Os.
+	 */
+	if (flags & PGSR_FLAG_MAINTENANCE)
+		max_ios = maintenance_io_concurrency;
+	else
+		max_ios = effective_io_concurrency;
+
+	/*
+	 * The desired level of I/O concurrency controls how far ahead we are
+	 * willing to look.  We also clamp it to at least
+	 * MAX_BUFFERS_PER_TRANSFER so that we can have a chance to build up a
+	 * full sized read, even when max_ios is zero.
+	 */
+	max_pinned_buffers = Max(max_ios * 4, MAX_BUFFERS_PER_TRANSFER);
+
+	/*
+	 * The *_io_concurrency GUCs might be set to 0, but we want to allow at
+	 * least one, to keep our gating logic simple.
+	 */
+	max_ios = Max(max_ios, 1);
+
+	/*
+	 * Don't allow this backend to pin too many buffers.  For now we'll apply
+	 * the limit for the shared buffer pool and the local buffer pool, without
+	 * worrying which it is.
+	 */
+	LimitAdditionalPins(&max_pinned_buffers);
+	LimitAdditionalLocalPins(&max_pinned_buffers);
+	Assert(max_pinned_buffers > 0);
+
+	/*
+	 * pgsr->ranges is a circular buffer.  When it is empty, head == tail.
+	 * When it is full, there is an empty element between head and tail.  Head
+	 * can also be empty (nblocks == 0), therefore we need two extra elements
+	 * for non-occupied ranges, on top of max_pinned_buffers to allow for the
+	 * maximum possible number of occupied ranges of the smallest possible
+	 * size of one.
+	 */
+	size = max_pinned_buffers + 2;
+
+	pgsr = (PgStreamingRead *)
+		palloc0(offsetof(PgStreamingRead, ranges) +
+				sizeof(pgsr->ranges[0]) * size);
+
+	pgsr->max_ios = max_ios;
+	pgsr->per_buffer_data_size = per_buffer_data_size;
+	pgsr->max_pinned_buffers = max_pinned_buffers;
+	pgsr->pgsr_private = pgsr_private;
+	pgsr->strategy = strategy;
+	pgsr->size = size;
+
+	pgsr->unget_blocknum = InvalidBlockNumber;
+
+#ifdef USE_PREFETCH
+
+	/*
+	 * This system supports prefetching advice.  As long as direct I/O isn't
+	 * enabled, and the caller hasn't promised sequential access, we can use
+	 * it.
+	 */
+	if ((io_direct_flags & IO_DIRECT_DATA) == 0 &&
+		(flags & PGSR_FLAG_SEQUENTIAL) == 0)
+		pgsr->advice_enabled = true;
+#endif
+
+	/*
+	 * We start off building small ranges, but double that quickly, for the
+	 * benefit of users that don't know how far ahead they'll read.  This can
+	 * be disabled by users that already know they'll read all the way.
+	 */
+	if (flags & PGSR_FLAG_FULL)
+		pgsr->ramp_up_pin_limit = INT_MAX;
+	else
+		pgsr->ramp_up_pin_limit = 1;
+
+	/*
+	 * We want to avoid creating ranges that are smaller than they could be
+	 * just because we hit max_pinned_buffers.  We only look ahead when the
+	 * number of pinned buffers falls below this trigger number, or put
+	 * another way, we stop looking ahead when we wouldn't be able to build a
+	 * "full sized" range.
+	 */
+	pgsr->pinned_buffers_trigger =
+		Max(1, (int) max_pinned_buffers - MAX_BUFFERS_PER_TRANSFER);
+
+	/* Space for the callback to store extra data along with each block. */
+	if (per_buffer_data_size)
+		pgsr->per_buffer_data = palloc(per_buffer_data_size * max_pinned_buffers);
+
+	return pgsr;
+}
+
+/*
+ * Create a new streaming read object that can be used to perform the
+ * equivalent of a series of ReadBuffer() calls for one fork of one relation.
+ * Internally, it generates larger vectored reads where possible by looking
+ * ahead.
+ */
+PgStreamingRead *
+pg_streaming_read_buffer_alloc(int flags,
+							   void *pgsr_private,
+							   size_t per_buffer_data_size,
+							   BufferAccessStrategy strategy,
+							   BufferManagerRelation bmr,
+							   ForkNumber forknum,
+							   PgStreamingReadBufferCB next_block_cb)
+{
+	PgStreamingRead *result;
+
+	result = pg_streaming_read_buffer_alloc_internal(flags,
+													 pgsr_private,
+													 per_buffer_data_size,
+													 strategy);
+	result->callback = next_block_cb;
+	result->bmr = bmr;
+	result->forknum = forknum;
+
+	return result;
+}
+
+/*
+ * Find the per-buffer data index for the Nth block of a range.
+ */
+static int
+get_per_buffer_data_index(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	int			result;
+
+	/*
+	 * Find slot in the circular buffer of per-buffer data, without using the
+	 * expensive % operator.
+	 */
+	result = range->per_buffer_data_index + n;
+	if (result >= pgsr->max_pinned_buffers)
+		result -= pgsr->max_pinned_buffers;
+	Assert(result == (range->per_buffer_data_index + n) % pgsr->max_pinned_buffers);
+
+	return result;
+}
+
+/*
+ * Return a pointer to the per-buffer data by index.
+ */
+static void *
+get_per_buffer_data_by_index(PgStreamingRead *pgsr, int per_buffer_data_index)
+{
+	return (char *) pgsr->per_buffer_data +
+		pgsr->per_buffer_data_size * per_buffer_data_index;
+}
+
+/*
+ * Return a pointer to the per-buffer data for the Nth block of a range.
+ */
+static void *
+get_per_buffer_data(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	return get_per_buffer_data_by_index(pgsr,
+										get_per_buffer_data_index(pgsr,
+																  range,
+																  n));
+}
+
+/*
+ * Start reading the head range, and create a new head range.  The new head
+ * range is returned.  It may not be empty, if StartReadBuffers() couldn't
+ * start the entire range; in that case the returned range contains the
+ * remaining portion of the range.
+ */
+static PgStreamingReadRange *
+pg_streaming_read_start_head_range(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *head_range;
+	PgStreamingReadRange *new_head_range;
+	int			nblocks_pinned;
+	int			flags;
+
+	/* Caller should make sure we never exceed max_ios. */
+	Assert(pgsr->ios_in_progress < pgsr->max_ios);
+
+	/* Should only call if the head range has some blocks to read. */
+	head_range = &pgsr->ranges[pgsr->head];
+	Assert(head_range->nblocks > 0);
+
+	/*
+	 * If advice hasn't been suppressed, this system supports it, and this
+	 * isn't a strictly sequential pattern, then we'll issue advice.
+	 */
+	if (pgsr->advice_enabled && head_range->blocknum != pgsr->seq_blocknum)
+		flags = READ_BUFFERS_ISSUE_ADVICE;
+	else
+		flags = 0;
+
+
+	/* Start reading as many blocks as we can from the head range. */
+	nblocks_pinned = head_range->nblocks;
+	head_range->need_wait =
+		StartReadBuffers(pgsr->bmr,
+						 head_range->buffers,
+						 pgsr->forknum,
+						 head_range->blocknum,
+						 &nblocks_pinned,
+						 pgsr->strategy,
+						 flags,
+						 &head_range->operation);
+
+	/* Did that start an I/O? */
+	if (head_range->need_wait && (flags & READ_BUFFERS_ISSUE_ADVICE))
+	{
+		head_range->advice_issued = true;
+		pgsr->ios_in_progress++;
+		Assert(pgsr->ios_in_progress <= pgsr->max_ios);
+	}
+
+	/*
+	 * StartReadBuffers() might have pinned fewer blocks than we asked it to,
+	 * but always at least one.
+	 */
+	Assert(nblocks_pinned <= head_range->nblocks);
+	Assert(nblocks_pinned >= 1);
+	pgsr->pinned_buffers += nblocks_pinned;
+
+	/*
+	 * Remember where the next block would be after that, so we can detect
+	 * sequential access next time.
+	 */
+	pgsr->seq_blocknum = head_range->blocknum + nblocks_pinned;
+
+	/*
+	 * Create a new head range.  There must be space, because we have enough
+	 * elements for every range to hold just one block, up to the pin limit.
+	 */
+	Assert(pgsr->size > pgsr->max_pinned_buffers);
+	Assert((pgsr->head + 1) % pgsr->size != pgsr->tail);
+	if (++pgsr->head == pgsr->size)
+		pgsr->head = 0;
+	new_head_range = &pgsr->ranges[pgsr->head];
+	new_head_range->nblocks = 0;
+	new_head_range->advice_issued = false;
+
+	/*
+	 * If we didn't manage to start the whole read above, we split the range,
+	 * moving the remainder into the new head range.
+	 */
+	if (nblocks_pinned < head_range->nblocks)
+	{
+		int			nblocks_remaining = head_range->nblocks - nblocks_pinned;
+
+		head_range->nblocks = nblocks_pinned;
+
+		new_head_range->blocknum = head_range->blocknum + nblocks_pinned;
+		new_head_range->nblocks = nblocks_remaining;
+	}
+
+	/* The new range has per-buffer data starting after the previous range. */
+	new_head_range->per_buffer_data_index =
+		get_per_buffer_data_index(pgsr, head_range, nblocks_pinned);
+
+	return new_head_range;
+}
+
+/*
+ * Ask the callback which block it would like us to read next, with a small
+ * buffer in front to allow pg_streaming_unget_block() to work.
+ */
+static BlockNumber
+pg_streaming_get_block(PgStreamingRead *pgsr, void *per_buffer_data)
+{
+	BlockNumber result;
+
+	if (unlikely(pgsr->unget_blocknum != InvalidBlockNumber))
+	{
+		/*
+		 * If we had to unget a block, now it is time to return that one
+		 * again.
+		 */
+		result = pgsr->unget_blocknum;
+		pgsr->unget_blocknum = InvalidBlockNumber;
+
+		/*
+		 * The same per_buffer_data element must have been used, and still
+		 * contains whatever data the callback wrote into it.  So we just
+		 * sanity-check that we were called with the value that
+		 * pg_streaming_unget_block() pushed back.
+		 */
+		Assert(per_buffer_data == pgsr->unget_per_buffer_data);
+	}
+	else
+	{
+		/* Use the installed callback directly. */
+		result = pgsr->callback(pgsr, pgsr->pgsr_private, per_buffer_data);
+	}
+
+	return result;
+}
+
+/*
+ * In order to deal with short reads in StartReadBuffers(), we sometimes need
+ * to defer handling of a block until later.  This *must* be called with the
+ * last value returned by pg_streaming_get_block().
+ */
+static void
+pg_streaming_unget_block(PgStreamingRead *pgsr, BlockNumber blocknum, void *per_buffer_data)
+{
+	Assert(pgsr->unget_blocknum == InvalidBlockNumber);
+	pgsr->unget_blocknum = blocknum;
+	pgsr->unget_per_buffer_data = per_buffer_data;
+}
+
+static void
+pg_streaming_read_look_ahead(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *range;
+
+	/*
+	 * If we're still ramping up, we may have to stall to wait for buffers to
+	 * be consumed first before we do any more prefetching.
+	 */
+	if (pgsr->ramp_up_pin_stall > 0)
+	{
+		Assert(pgsr->pinned_buffers > 0);
+		return;
+	}
+
+	/*
+	 * If we're finished or can't start more I/O, then don't look ahead.
+	 */
+	if (pgsr->finished || pgsr->ios_in_progress == pgsr->max_ios)
+		return;
+
+	/*
+	 * We'll also wait until the number of pinned buffers falls below our
+	 * trigger level, so that we have the chance to create a full range.
+	 */
+	if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+		return;
+
+	do
+	{
+		BlockNumber blocknum;
+		void	   *per_buffer_data;
+
+		/* Do we have a full-sized range? */
+		range = &pgsr->ranges[pgsr->head];
+		if (range->nblocks == lengthof(range->buffers))
+		{
+			/* Start as much of it as we can. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/* If we're now at the I/O limit, stop here. */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+				return;
+
+			/*
+			 * If we couldn't form a full range, then stop here to avoid
+			 * creating small I/O.
+			 */
+			if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+				return;
+
+			/*
+			 * That might have only been partially started, but it always
+			 * processes at least one block, so that'll do for now.
+			 */
+			Assert(range->nblocks < lengthof(range->buffers));
+		}
+
+		/* Find per-buffer data slot for the next block. */
+		per_buffer_data = get_per_buffer_data(pgsr, range, range->nblocks);
+
+		/* Find out which block the callback wants to read next. */
+		blocknum = pg_streaming_get_block(pgsr, per_buffer_data);
+		if (blocknum == InvalidBlockNumber)
+		{
+			/* End of stream. */
+			pgsr->finished = true;
+			break;
+		}
+
+		/*
+		 * Is there a head range that we cannot extend, because the requested
+		 * block is not consecutive?
+		 */
+		if (range->nblocks > 0 &&
+			range->blocknum + range->nblocks != blocknum)
+		{
+			/* Yes.  Start it, so we can begin building a new one. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * It's possible that it was only partially started, and we have a
+			 * new range with the remainder.  Keep starting I/Os until we get
+			 * it all out of the way, or we hit the I/O limit.
+			 */
+			while (range->nblocks > 0 && pgsr->ios_in_progress < pgsr->max_ios)
+				range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * We have to 'unget' the block returned by the callback if we
+			 * don't have enough I/O capacity left to start something.
+			 */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+			{
+				pg_streaming_unget_block(pgsr, blocknum, per_buffer_data);
+				return;
+			}
+		}
+
+		/* If we have a new, empty range, initialize the start block. */
+		if (range->nblocks == 0)
+		{
+			range->blocknum = blocknum;
+		}
+
+		/* This block extends the range by one. */
+		Assert(range->blocknum + range->nblocks == blocknum);
+		range->nblocks++;
+
+	} while (pgsr->pinned_buffers + range->nblocks < pgsr->max_pinned_buffers &&
+			 pgsr->pinned_buffers + range->nblocks < pgsr->ramp_up_pin_limit);
+
+	/* If we've hit the ramp-up limit, insert a stall. */
+	if (pgsr->pinned_buffers + range->nblocks >= pgsr->ramp_up_pin_limit)
+	{
+		/* Can't get here if an earlier stall hasn't finished. */
+		Assert(pgsr->ramp_up_pin_stall == 0);
+		/* Don't do any more prefetching until these buffers are consumed. */
+		pgsr->ramp_up_pin_stall = pgsr->ramp_up_pin_limit;
+		/* Double it.  It will soon be out of the way. */
+		pgsr->ramp_up_pin_limit *= 2;
+	}
+
+	/* Start as much as we can. */
+	while (range->nblocks > 0)
+	{
+		range = pg_streaming_read_start_head_range(pgsr);
+		if (pgsr->ios_in_progress == pgsr->max_ios)
+			break;
+	}
+}
+
+Buffer
+pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)
+{
+	pg_streaming_read_look_ahead(pgsr);
+
+	/* See if we have one buffer to return. */
+	while (pgsr->tail != pgsr->head)
+	{
+		PgStreamingReadRange *tail_range;
+
+		tail_range = &pgsr->ranges[pgsr->tail];
+
+		/*
+		 * Do we need to perform an I/O before returning the buffers from this
+		 * range?
+		 */
+		if (tail_range->need_wait)
+		{
+			WaitReadBuffers(&tail_range->operation);
+			tail_range->need_wait = false;
+
+			/*
+			 * We don't really know if the kernel generated a physical I/O
+			 * when we issued advice, let alone when it finished, but it has
+			 * certainly finished now because we've performed the read.
+			 */
+			if (tail_range->advice_issued)
+			{
+				Assert(pgsr->ios_in_progress > 0);
+				pgsr->ios_in_progress--;
+			}
+		}
+
+		/* Are there more buffers available in this range? */
+		if (pgsr->next_tail_buffer < tail_range->nblocks)
+		{
+			int			buffer_index;
+			Buffer		buffer;
+
+			buffer_index = pgsr->next_tail_buffer++;
+			buffer = tail_range->buffers[buffer_index];
+
+			Assert(BufferIsValid(buffer));
+
+			/* We are giving away ownership of this pinned buffer. */
+			Assert(pgsr->pinned_buffers > 0);
+			pgsr->pinned_buffers--;
+
+			if (pgsr->ramp_up_pin_stall > 0)
+				pgsr->ramp_up_pin_stall--;
+
+			if (per_buffer_data)
+				*per_buffer_data = get_per_buffer_data(pgsr, tail_range, buffer_index);
+
+			return buffer;
+		}
+
+		/* Advance tail to next range, if there is one. */
+		if (++pgsr->tail == pgsr->size)
+			pgsr->tail = 0;
+		pgsr->next_tail_buffer = 0;
+
+		/*
+		 * If tail crashed into head, and head is not empty, then it is time
+		 * to start that range.
+		 */
+		if (pgsr->tail == pgsr->head &&
+			pgsr->ranges[pgsr->head].nblocks > 0)
+			pg_streaming_read_start_head_range(pgsr);
+	}
+
+	Assert(pgsr->pinned_buffers == 0);
+
+	return InvalidBuffer;
+}
+
+void
+pg_streaming_read_free(PgStreamingRead *pgsr)
+{
+	Buffer		buffer;
+
+	/* Stop looking ahead. */
+	pgsr->finished = true;
+
+	/* Unpin anything that wasn't consumed. */
+	while ((buffer = pg_streaming_read_buffer_get_next(pgsr, NULL)) != InvalidBuffer)
+		ReleaseBuffer(buffer);
+
+	Assert(pgsr->pinned_buffers == 0);
+	Assert(pgsr->ios_in_progress == 0);
+
+	/* Release memory. */
+	if (pgsr->per_buffer_data)
+		pfree(pgsr->per_buffer_data);
+
+	pfree(pgsr);
+}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f0f8d4259c5..729d1f91721 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -19,6 +19,11 @@
  *		and pin it so that no one can destroy it while this process
  *		is using it.
  *
+ * StartReadBuffers() -- as above, but for multiple contiguous blocks in
+ *		two steps.
+ *
+ * WaitReadBuffers() -- second step of StartReadBuffers().
+ *
  * ReleaseBuffer() -- unpin a buffer
  *
  * MarkBufferDirty() -- mark a pinned buffer's contents as "dirty".
@@ -471,10 +476,9 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
 )
 
 
-static Buffer ReadBuffer_common(SMgrRelation smgr, char relpersistence,
+static Buffer ReadBuffer_common(BufferManagerRelation bmr,
 								ForkNumber forkNum, BlockNumber blockNum,
-								ReadBufferMode mode, BufferAccessStrategy strategy,
-								bool *hit);
+								ReadBufferMode mode, BufferAccessStrategy strategy);
 static BlockNumber ExtendBufferedRelCommon(BufferManagerRelation bmr,
 										   ForkNumber fork,
 										   BufferAccessStrategy strategy,
@@ -500,7 +504,7 @@ static uint32 WaitBufHdrUnlocked(BufferDesc *buf);
 static int	SyncOneBuffer(int buf_id, bool skip_recently_used,
 						  WritebackContext *wb_context);
 static void WaitIO(BufferDesc *buf);
-static bool StartBufferIO(BufferDesc *buf, bool forInput);
+static bool StartBufferIO(BufferDesc *buf, bool forInput, bool nowait);
 static void TerminateBufferIO(BufferDesc *buf, bool clear_dirty,
 							  uint32 set_flag_bits, bool forget_owner);
 static void AbortBufferIO(Buffer buffer);
@@ -781,7 +785,6 @@ Buffer
 ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				   ReadBufferMode mode, BufferAccessStrategy strategy)
 {
-	bool		hit;
 	Buffer		buf;
 
 	/*
@@ -794,15 +797,9 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary tables of other sessions")));
 
-	/*
-	 * Read the buffer, and update pgstat counters to reflect a cache hit or
-	 * miss.
-	 */
-	pgstat_count_buffer_read(reln);
-	buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
-							forkNum, blockNum, mode, strategy, &hit);
-	if (hit)
-		pgstat_count_buffer_hit(reln);
+	buf = ReadBuffer_common(BMR_REL(reln),
+							forkNum, blockNum, mode, strategy);
+
 	return buf;
 }
 
@@ -822,13 +819,12 @@ ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
 						  BlockNumber blockNum, ReadBufferMode mode,
 						  BufferAccessStrategy strategy, bool permanent)
 {
-	bool		hit;
-
 	SMgrRelation smgr = smgropen(rlocator, INVALID_PROC_NUMBER);
 
-	return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
-							 RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
-							 mode, strategy, &hit);
+	return ReadBuffer_common(BMR_SMGR(smgr, permanent ? RELPERSISTENCE_PERMANENT :
+									  RELPERSISTENCE_UNLOGGED),
+							 forkNum, blockNum,
+							 mode, strategy);
 }
 
 /*
@@ -994,35 +990,68 @@ ExtendBufferedRelTo(BufferManagerRelation bmr,
 	 */
 	if (buffer == InvalidBuffer)
 	{
-		bool		hit;
-
 		Assert(extended_by == 0);
-		buffer = ReadBuffer_common(bmr.smgr, bmr.relpersistence,
-								   fork, extend_to - 1, mode, strategy,
-								   &hit);
+		buffer = ReadBuffer_common(bmr, fork, extend_to - 1, mode, strategy);
 	}
 
 	return buffer;
 }
 
+/*
+ * Zero a buffer and lock it, as part of the implementation of
+ * RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK.  The buffer must be already
+ * pinned.  It does not have to be valid, but it is valid and locked on
+ * return.
+ */
+static void
+ZeroBuffer(Buffer buffer, ReadBufferMode mode)
+{
+	BufferDesc *bufHdr;
+	uint32		buf_state;
+
+	Assert(mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK);
+
+	if (BufferIsLocal(buffer))
+		bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+	else
+	{
+		bufHdr = GetBufferDescriptor(buffer - 1);
+		if (mode == RBM_ZERO_AND_LOCK)
+			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+		else
+			LockBufferForCleanup(buffer);
+	}
+
+	memset(BufferGetPage(buffer), 0, BLCKSZ);
+
+	if (BufferIsLocal(buffer))
+	{
+		buf_state = pg_atomic_read_u32(&bufHdr->state);
+		buf_state |= BM_VALID;
+		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+	}
+	else
+	{
+		buf_state = LockBufHdr(bufHdr);
+		buf_state |= BM_VALID;
+		UnlockBufHdr(bufHdr, buf_state);
+	}
+}
+
 /*
  * ReadBuffer_common -- common logic for all ReadBuffer variants
  *
  * *hit is set to true if the request was satisfied from shared buffer cache.
  */
 static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(BufferManagerRelation bmr, ForkNumber forkNum,
 				  BlockNumber blockNum, ReadBufferMode mode,
-				  BufferAccessStrategy strategy, bool *hit)
+				  BufferAccessStrategy strategy)
 {
-	BufferDesc *bufHdr;
-	Block		bufBlock;
-	bool		found;
-	IOContext	io_context;
-	IOObject	io_object;
-	bool		isLocalBuf = SmgrIsTemp(smgr);
-
-	*hit = false;
+	ReadBuffersOperation operation;
+	Buffer		buffer;
+	int			nblocks;
+	int			flags;
 
 	/*
 	 * Backward compatibility path, most code should use ExtendBufferedRel()
@@ -1041,181 +1070,404 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
 			flags |= EB_LOCK_FIRST;
 
-		return ExtendBufferedRel(BMR_SMGR(smgr, relpersistence),
-								 forkNum, strategy, flags);
+		return ExtendBufferedRel(bmr, forkNum, strategy, flags);
 	}
 
-	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
-									   smgr->smgr_rlocator.locator.spcOid,
-									   smgr->smgr_rlocator.locator.dbOid,
-									   smgr->smgr_rlocator.locator.relNumber,
-									   smgr->smgr_rlocator.backend);
+	nblocks = 1;
+	if (mode == RBM_ZERO_ON_ERROR)
+		flags = READ_BUFFERS_ZERO_ON_ERROR;
+	else
+		flags = 0;
+	if (StartReadBuffers(bmr,
+						 &buffer,
+						 forkNum,
+						 blockNum,
+						 &nblocks,
+						 strategy,
+						 flags,
+						 &operation))
+		WaitReadBuffers(&operation);
+	Assert(nblocks == 1);		/* single block can't be short */
+
+	if (mode == RBM_ZERO_AND_CLEANUP_LOCK || mode == RBM_ZERO_AND_LOCK)
+		ZeroBuffer(buffer, mode);
+
+	return buffer;
+}
+
+static Buffer
+PrepareReadBuffer(BufferManagerRelation bmr,
+				  ForkNumber forkNum,
+				  BlockNumber blockNum,
+				  BufferAccessStrategy strategy,
+				  bool *foundPtr)
+{
+	BufferDesc *bufHdr;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	Assert(blockNum != P_NEW);
 
+	Assert(bmr.smgr);
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/*
-		 * We do not use a BufferAccessStrategy for I/O of temporary tables.
-		 * However, in some cases, the "strategy" may not be NULL, so we can't
-		 * rely on IOContextForStrategy() to set the right IOContext for us.
-		 * This may happen in cases like CREATE TEMPORARY TABLE AS...
-		 */
 		io_context = IOCONTEXT_NORMAL;
 		io_object = IOOBJECT_TEMP_RELATION;
-		bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
-		if (found)
-			pgBufferUsage.local_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.local_blks_read++;
 	}
 	else
 	{
-		/*
-		 * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
-		 * not currently in memory.
-		 */
 		io_context = IOContextForStrategy(strategy);
 		io_object = IOOBJECT_RELATION;
-		bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
-							 strategy, &found, io_context);
-		if (found)
-			pgBufferUsage.shared_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.shared_blks_read++;
 	}
 
-	/* At this point we do NOT hold any locks. */
+	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
+									   bmr.smgr->smgr_rlocator.locator.spcOid,
+									   bmr.smgr->smgr_rlocator.locator.dbOid,
+									   bmr.smgr->smgr_rlocator.locator.relNumber,
+									   bmr.smgr->smgr_rlocator.backend);
 
-	/* if it was already in the buffer pool, we're done */
-	if (found)
+	ResourceOwnerEnlarge(CurrentResourceOwner);
+	if (isLocalBuf)
+	{
+		bufHdr = LocalBufferAlloc(bmr.smgr, forkNum, blockNum, foundPtr);
+		if (*foundPtr)
+			pgBufferUsage.local_blks_hit++;
+	}
+	else
+	{
+		bufHdr = BufferAlloc(bmr.smgr, bmr.relpersistence, forkNum, blockNum,
+							 strategy, foundPtr, io_context);
+		if (*foundPtr)
+			pgBufferUsage.shared_blks_hit++;
+	}
+	if (bmr.rel)
+	{
+		/*
+		 * While pgBufferUsage's "read" counter isn't bumped unless we reach
+		 * WaitReadBuffers() (so, not for hits, and not for buffers that are
+		 * zeroed instead), the per-relation stats always count them.
+		 */
+		pgstat_count_buffer_read(bmr.rel);
+		if (*foundPtr)
+			pgstat_count_buffer_hit(bmr.rel);
+	}
+	if (*foundPtr)
 	{
-		/* Just need to update stats before we exit */
-		*hit = true;
 		VacuumPageHit++;
 		pgstat_count_io_op(io_object, io_context, IOOP_HIT);
-
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageHit;
 
 		TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-										  smgr->smgr_rlocator.locator.spcOid,
-										  smgr->smgr_rlocator.locator.dbOid,
-										  smgr->smgr_rlocator.locator.relNumber,
-										  smgr->smgr_rlocator.backend,
-										  found);
+										  bmr.smgr->smgr_rlocator.locator.spcOid,
+										  bmr.smgr->smgr_rlocator.locator.dbOid,
+										  bmr.smgr->smgr_rlocator.locator.relNumber,
+										  bmr.smgr->smgr_rlocator.backend,
+										  true);
+	}
 
-		/*
-		 * In RBM_ZERO_AND_LOCK mode the caller expects the page to be locked
-		 * on return.
-		 */
-		if (!isLocalBuf)
-		{
-			if (mode == RBM_ZERO_AND_LOCK)
-				LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
-							  LW_EXCLUSIVE);
-			else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
-				LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
-		}
+	return BufferDescriptorGetBuffer(bufHdr);
+}
 
-		return BufferDescriptorGetBuffer(bufHdr);
+/*
+ * Begin reading a range of blocks beginning at blockNum and extending for
+ * *nblocks.  On return, up to *nblocks pinned buffers holding those blocks
+ * are written into the buffers array, and *nblocks is updated to contain the
+ * actual number, which may be fewer than requested.
+ *
+ * If false is returned, no I/O is necessary and WaitReadBuffers() need not
+ * be called.  If true is returned, one I/O has been started, and
+ * WaitReadBuffers() must be called with the same operation object before the
+ * buffers are accessed.  Along with the operation object, the caller-supplied
+ * array of buffers must remain valid until WaitReadBuffers() is called.
+ *
+ * Currently the I/O is only started with optional operating system advice,
+ * and the real I/O happens in WaitReadBuffers().  In future work, true I/O
+ * could be initiated here.
+ */
+bool
+StartReadBuffers(BufferManagerRelation bmr,
+				 Buffer *buffers,
+				 ForkNumber forkNum,
+				 BlockNumber blockNum,
+				 int *nblocks,
+				 BufferAccessStrategy strategy,
+				 int flags,
+				 ReadBuffersOperation *operation)
+{
+	int			actual_nblocks = *nblocks;
+
+	if (bmr.rel)
+	{
+		bmr.smgr = RelationGetSmgr(bmr.rel);
+		bmr.relpersistence = bmr.rel->rd_rel->relpersistence;
 	}
 
-	/*
-	 * if we have gotten to this point, we have allocated a buffer for the
-	 * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
-	 * if it's a shared buffer.
-	 */
-	Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));	/* spinlock not needed */
+	operation->bmr = bmr;
+	operation->forknum = forkNum;
+	operation->blocknum = blockNum;
+	operation->buffers = buffers;
+	operation->nblocks = actual_nblocks;
+	operation->strategy = strategy;
+	operation->flags = flags;
 
-	bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
+	operation->io_buffers_len = 0;
 
-	/*
-	 * Read in the page, unless the caller intends to overwrite it and just
-	 * wants us to allocate a buffer.
-	 */
-	if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
-		MemSet((char *) bufBlock, 0, BLCKSZ);
-	else
+	for (int i = 0; i < actual_nblocks; ++i)
 	{
-		instr_time	io_start = pgstat_prepare_io_time(track_io_timing);
+		bool		found;
 
-		smgrread(smgr, forkNum, blockNum, bufBlock);
+		buffers[i] = PrepareReadBuffer(bmr,
+									   forkNum,
+									   blockNum + i,
+									   strategy,
+									   &found);
 
-		pgstat_count_io_op_time(io_object, io_context,
-								IOOP_READ, io_start, 1);
+		if (found)
+		{
+			/*
+			 * Terminate the read as soon as we get a hit.  It could be a
+			 * single buffer hit, or it could be a hit that follows a readable
+			 * range.  We don't want to create more than one readable range,
+			 * so we stop here.
+			 */
+			actual_nblocks = operation->nblocks = *nblocks = i + 1;
+		}
+		else
+		{
+			/* Extend the readable range to cover this block. */
+			operation->io_buffers_len++;
+		}
+	}
 
-		/* check for garbage data */
-		if (!PageIsVerifiedExtended((Page) bufBlock, blockNum,
-									PIV_LOG_WARNING | PIV_REPORT_STAT))
+	if (operation->io_buffers_len > 0)
+	{
+		if (flags & READ_BUFFERS_ISSUE_ADVICE)
 		{
-			if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
-			{
-				ereport(WARNING,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s; zeroing out page",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
-				MemSet((char *) bufBlock, 0, BLCKSZ);
-			}
-			else
-				ereport(ERROR,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
+			/*
+			 * In theory we should only do this if PrepareReadBuffer() had to
+			 * allocate new buffers above.  That way, if two calls to
+			 * StartReadBuffers() were made for the same blocks before
+			 * WaitReadBuffers(), only the first would issue the advice.
+			 * That'd be a better simulation of true asynchronous I/O, which
+			 * would only start the I/O once, but isn't done here for
+			 * simplicity.  Note also that the following call might actually
+			 * issue two advice calls if we cross a segment boundary; in a
+			 * true asynchronous version we might choose to process only one
+			 * real I/O at a time in that case.
+			 */
+			smgrprefetch(bmr.smgr, forkNum, blockNum, operation->io_buffers_len);
 		}
+
+		/* Indicate that WaitReadBuffers() should be called. */
+		return true;
 	}
+	else
+	{
+		return false;
+	}
+}
 
-	/*
-	 * In RBM_ZERO_AND_LOCK / RBM_ZERO_AND_CLEANUP_LOCK mode, grab the buffer
-	 * content lock before marking the page as valid, to make sure that no
-	 * other backend sees the zeroed page before the caller has had a chance
-	 * to initialize it.
-	 *
-	 * Since no-one else can be looking at the page contents yet, there is no
-	 * difference between an exclusive lock and a cleanup-strength lock. (Note
-	 * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
-	 * they assert that the buffer is already valid.)
-	 */
-	if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
-		!isLocalBuf)
+static inline bool
+WaitReadBuffersCanStartIO(Buffer buffer, bool nowait)
+{
+	if (BufferIsLocal(buffer))
 	{
-		LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
+		BufferDesc *bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+
+		return (pg_atomic_read_u32(&bufHdr->state) & BM_VALID) == 0;
 	}
+	else
+		return StartBufferIO(GetBufferDescriptor(buffer - 1), true, nowait);
+}
+
+void
+WaitReadBuffers(ReadBuffersOperation *operation)
+{
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	int			nblocks;
+	BlockNumber blocknum;
+	ForkNumber	forknum;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	/*
+	 * Currently operations are only allowed to include a read of some range,
+	 * with an optional extra buffer that is already pinned at the end.  So
+	 * nblocks can be at most one more than io_buffers_len.
+	 */
+	Assert((operation->nblocks == operation->io_buffers_len) ||
+		   (operation->nblocks == operation->io_buffers_len + 1));
 
+	/* Find the range of the physical read we need to perform. */
+	nblocks = operation->io_buffers_len;
+	if (nblocks == 0)
+		return;					/* nothing to do */
+
+	buffers = &operation->buffers[0];
+	blocknum = operation->blocknum;
+	forknum = operation->forknum;
+	bmr = operation->bmr;
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/* Only need to adjust flags */
-		uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
-
-		buf_state |= BM_VALID;
-		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+		io_context = IOCONTEXT_NORMAL;
+		io_object = IOOBJECT_TEMP_RELATION;
 	}
 	else
 	{
-		/* Set BM_VALID, terminate IO, and wake up any waiters */
-		TerminateBufferIO(bufHdr, false, BM_VALID, true);
+		io_context = IOContextForStrategy(operation->strategy);
+		io_object = IOOBJECT_RELATION;
 	}
 
-	VacuumPageMiss++;
-	if (VacuumCostActive)
-		VacuumCostBalance += VacuumCostPageMiss;
+	/*
+	 * We count all these blocks as read by this backend.  This is traditional
+	 * behavior, but might turn out to be not true if we find that someone
+	 * else has beaten us and completed the read of some of these blocks.  In
+	 * that case the system globally double-counts, but we traditionally don't
+	 * count this as a "hit", and we don't have a separate counter for "miss,
+	 * but another backend completed the read".
+	 */
+	if (isLocalBuf)
+		pgBufferUsage.local_blks_read += nblocks;
+	else
+		pgBufferUsage.shared_blks_read += nblocks;
 
-	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-									  smgr->smgr_rlocator.locator.spcOid,
-									  smgr->smgr_rlocator.locator.dbOid,
-									  smgr->smgr_rlocator.locator.relNumber,
-									  smgr->smgr_rlocator.backend,
-									  found);
+	for (int i = 0; i < nblocks; ++i)
+	{
+		int			io_buffers_len;
+		Buffer		io_buffers[MAX_BUFFERS_PER_TRANSFER];
+		void	   *io_pages[MAX_BUFFERS_PER_TRANSFER];
+		instr_time	io_start;
+		BlockNumber io_first_block;
 
-	return BufferDescriptorGetBuffer(bufHdr);
+		/*
+		 * Skip this block if someone else has already completed it.  If an
+		 * I/O is already in progress in another backend, this will wait for
+		 * the outcome: either done, or something went wrong and we will
+		 * retry.
+		 */
+		if (!WaitReadBuffersCanStartIO(buffers[i], false))
+		{
+			/*
+			 * Report this as a 'hit' for this backend, even though it must
+			 * have started out as a miss in PrepareReadBuffer().
+			 */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + i,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  true);
+			continue;
+		}
+
+		/* We found a buffer that we need to read in. */
+		io_buffers[0] = buffers[i];
+		io_pages[0] = BufferGetBlock(buffers[i]);
+		io_first_block = blocknum + i;
+		io_buffers_len = 1;
+
+		/*
+		 * How many neighboring-on-disk blocks can we scatter-read into other
+		 * buffers at the same time?  In this case we don't wait if we see an
+		 * I/O already in progress.  We already hold BM_IO_IN_PROGRESS for the
+		 * head block, so we should get on with that I/O as soon as possible.
+		 * Any block we skip here will be retried by the outer loop above.
+		 */
+		while ((i + 1) < nblocks &&
+			   WaitReadBuffersCanStartIO(buffers[i + 1], true))
+		{
+			/* Must be consecutive block numbers. */
+			Assert(BufferGetBlockNumber(buffers[i + 1]) ==
+				   BufferGetBlockNumber(buffers[i]) + 1);
+
+			io_buffers[io_buffers_len] = buffers[++i];
+			io_pages[io_buffers_len++] = BufferGetBlock(buffers[i]);
+		}
+
+		io_start = pgstat_prepare_io_time(track_io_timing);
+		smgrreadv(bmr.smgr, forknum, io_first_block, io_pages, io_buffers_len);
+		pgstat_count_io_op_time(io_object, io_context, IOOP_READ, io_start,
+								io_buffers_len);
+
+		/* Verify each block we read, and terminate the I/O. */
+		for (int j = 0; j < io_buffers_len; ++j)
+		{
+			BufferDesc *bufHdr;
+			Block		bufBlock;
+
+			if (isLocalBuf)
+			{
+				bufHdr = GetLocalBufferDescriptor(-io_buffers[j] - 1);
+				bufBlock = LocalBufHdrGetBlock(bufHdr);
+			}
+			else
+			{
+				bufHdr = GetBufferDescriptor(io_buffers[j] - 1);
+				bufBlock = BufHdrGetBlock(bufHdr);
+			}
+
+			/* check for garbage data */
+			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
+										PIV_LOG_WARNING | PIV_REPORT_STAT))
+			{
+				if ((operation->flags & READ_BUFFERS_ZERO_ON_ERROR) || zero_damaged_pages)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s; zeroing out page",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+					memset(bufBlock, 0, BLCKSZ);
+				}
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+			}
+
+			/* Terminate I/O and set BM_VALID. */
+			if (isLocalBuf)
+			{
+				uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
+
+				buf_state |= BM_VALID;
+				pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+			}
+			else
+			{
+				/* Set BM_VALID, terminate IO, and wake up any waiters */
+				TerminateBufferIO(bufHdr, false, BM_VALID, true);
+			}
+
+			/* Report I/Os as completing individually. */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, io_first_block + j,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  false);
+		}
+
+		VacuumPageMiss += io_buffers_len;
+		if (VacuumCostActive)
+			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
+	}
 }
 
 /*
- * BufferAlloc -- subroutine for ReadBuffer.  Handles lookup of a shared
- *		buffer.  If no buffer exists already, selects a replacement
- *		victim and evicts the old page, but does NOT read in new page.
+ * BufferAlloc -- subroutine for StartReadBuffers.  Handles lookup of a shared
+ *		buffer.  If no buffer exists already, selects a replacement victim and
+ *		evicts the old page, but does NOT read in new page.
  *
  * "strategy" can be a buffer replacement strategy object, or NULL for
  * the default strategy.  The selected buffer's usage_count is advanced when
@@ -1223,11 +1475,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
  *
  * The returned buffer is pinned and is already marked as holding the
  * desired page.  If it already did have the desired page, *foundPtr is
- * set true.  Otherwise, *foundPtr is set false and the buffer is marked
- * as IO_IN_PROGRESS; ReadBuffer will now need to do I/O to fill it.
- *
- * *foundPtr is actually redundant with the buffer's BM_VALID flag, but
- * we keep it for simplicity in ReadBuffer.
+ * set true.  Otherwise, *foundPtr is set false.
  *
  * io_context is passed as an output parameter to avoid calling
  * IOContextForStrategy() when there is a shared buffers hit and no IO
@@ -1286,19 +1534,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(buf, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return buf;
@@ -1363,19 +1602,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(existing_buf_hdr, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return existing_buf_hdr;
@@ -1407,15 +1637,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	LWLockRelease(newPartitionLock);
 
 	/*
-	 * Buffer contents are currently invalid.  Try to obtain the right to
-	 * start I/O.  If StartBufferIO returns false, then someone else managed
-	 * to read it before we did, so there's nothing left for BufferAlloc() to
-	 * do.
+	 * Buffer contents are currently invalid.
 	 */
-	if (StartBufferIO(victim_buf_hdr, true))
-		*foundPtr = false;
-	else
-		*foundPtr = true;
+	*foundPtr = false;
 
 	return victim_buf_hdr;
 }
@@ -1769,7 +1993,7 @@ again:
  * pessimistic, but outside of toy-sized shared_buffers it should allow
  * sufficient pins.
  */
-static void
+void
 LimitAdditionalPins(uint32 *additional_pins)
 {
 	uint32		max_backends;
@@ -2034,7 +2258,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 
 				buf_state &= ~BM_VALID;
 				UnlockBufHdr(existing_hdr, buf_state);
-			} while (!StartBufferIO(existing_hdr, true));
+			} while (!StartBufferIO(existing_hdr, true, false));
 		}
 		else
 		{
@@ -2057,7 +2281,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 			LWLockRelease(partition_lock);
 
 			/* XXX: could combine the locked operations in it with the above */
-			StartBufferIO(victim_buf_hdr, true);
+			StartBufferIO(victim_buf_hdr, true, false);
 		}
 	}
 
@@ -2372,7 +2596,12 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 	else
 	{
 		/*
-		 * If we previously pinned the buffer, it must surely be valid.
+		 * If we previously pinned the buffer, it is likely to be valid, but
+		 * it may not be if StartReadBuffers() was called and
+		 * WaitReadBuffers() hasn't been called yet.  We'll check by loading
+		 * the flags without locking.  This is racy, but it's OK to return
+		 * false spuriously: when WaitReadBuffers() calls StartBufferIO(),
+		 * it'll see that it's now valid.
 		 *
 		 * Note: We deliberately avoid a Valgrind client request here.
 		 * Individual access methods can optionally superimpose buffer page
@@ -2381,7 +2610,7 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 		 * that the buffer page is legitimately non-accessible here.  We
 		 * cannot meddle with that.
 		 */
-		result = true;
+		result = (pg_atomic_read_u32(&buf->state) & BM_VALID) != 0;
 	}
 
 	ref->refcount++;
@@ -3449,7 +3678,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOObject io_object,
 	 * someone else flushed the buffer before we could, so we need not do
 	 * anything.
 	 */
-	if (!StartBufferIO(buf, false))
+	if (!StartBufferIO(buf, false, false))
 		return;
 
 	/* Setup error traceback support for ereport() */
@@ -5184,9 +5413,15 @@ WaitIO(BufferDesc *buf)
  *
  * Returns true if we successfully marked the buffer as I/O busy,
  * false if someone else already did the work.
+ *
+ * If nowait is true, then we don't wait for an I/O to be finished by another
+ * backend.  In that case, false indicates either that the I/O was already
+ * finished, or is still in progress.  This is useful for callers that want to
+ * find out if they can perform the I/O as part of a larger operation, without
+ * waiting for the answer or distinguishing the reasons why not.
  */
 static bool
-StartBufferIO(BufferDesc *buf, bool forInput)
+StartBufferIO(BufferDesc *buf, bool forInput, bool nowait)
 {
 	uint32		buf_state;
 
@@ -5199,6 +5434,8 @@ StartBufferIO(BufferDesc *buf, bool forInput)
 		if (!(buf_state & BM_IO_IN_PROGRESS))
 			break;
 		UnlockBufHdr(buf, buf_state);
+		if (nowait)
+			return false;
 		WaitIO(buf);
 	}
 
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index fcfac335a57..985a2c7049c 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -108,10 +108,9 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
  * LocalBufferAlloc -
  *	  Find or create a local buffer for the given page of the given relation.
  *
- * API is similar to bufmgr.c's BufferAlloc, except that we do not need
- * to do any locking since this is all local.   Also, IO_IN_PROGRESS
- * does not get set.  Lastly, we support only default access strategy
- * (hence, usage_count is always advanced).
+ * API is similar to bufmgr.c's BufferAlloc, except that we do not need to do
+ * any locking since this is all local.  We support only default access
+ * strategy (hence, usage_count is always advanced).
  */
 BufferDesc *
 LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
@@ -287,7 +286,7 @@ GetLocalVictimBuffer(void)
 }
 
 /* see LimitAdditionalPins() */
-static void
+void
 LimitAdditionalLocalPins(uint32 *additional_pins)
 {
 	uint32		max_pins;
@@ -297,9 +296,10 @@ LimitAdditionalLocalPins(uint32 *additional_pins)
 
 	/*
 	 * In contrast to LimitAdditionalPins() other backends don't play a role
-	 * here. We can allow up to NLocBuffer pins in total.
+	 * here. We can allow up to NLocBuffer pins in total, but NLocBuffer
+	 * might not be initialized yet, so read num_temp_buffers instead.
 	 */
-	max_pins = (NLocBuffer - NLocalPinnedBuffers);
+	max_pins = (num_temp_buffers - NLocalPinnedBuffers);
 
 	if (*additional_pins >= max_pins)
 		*additional_pins = max_pins;
diff --git a/src/backend/storage/meson.build b/src/backend/storage/meson.build
index 40345bdca27..739d13293fb 100644
--- a/src/backend/storage/meson.build
+++ b/src/backend/storage/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
+subdir('aio')
 subdir('buffer')
 subdir('file')
 subdir('freespace')
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d3353..b57f71f97e3 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,6 +14,7 @@
 #ifndef BUFMGR_H
 #define BUFMGR_H
 
+#include "port/pg_iovec.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/bufpage.h"
@@ -158,6 +159,11 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 #define BUFFER_LOCK_SHARE		1
 #define BUFFER_LOCK_EXCLUSIVE	2
 
+/*
+ * Maximum number of buffers for multi-buffer I/O functions.  This is set to
+ * allow 128kB transfers, unless BLCKSZ and IOV_MAX imply a smaller maximum.
+ */
+#define MAX_BUFFERS_PER_TRANSFER Min(PG_IOV_MAX, (128 * 1024) / BLCKSZ)
 
 /*
  * prototypes for functions in bufmgr.c
@@ -177,6 +183,42 @@ extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
 										ForkNumber forkNum, BlockNumber blockNum,
 										ReadBufferMode mode, BufferAccessStrategy strategy,
 										bool permanent);
+
+#define READ_BUFFERS_ZERO_ON_ERROR 0x01
+#define READ_BUFFERS_ISSUE_ADVICE 0x02
+
+/*
+ * Private state used by StartReadBuffers() and WaitReadBuffers().  Declared
+ * in public header only to allow inclusion in other structs, but contents
+ * should not be accessed.
+ */
+struct ReadBuffersOperation
+{
+	/* Parameters passed in to StartReadBuffers(). */
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	ForkNumber	forknum;
+	BlockNumber blocknum;
+	int			nblocks;
+	BufferAccessStrategy strategy;
+	int			flags;
+
+	/* Range of buffers, if we need to perform a read. */
+	int			io_buffers_len;
+};
+
+typedef struct ReadBuffersOperation ReadBuffersOperation;
+
+extern bool StartReadBuffers(BufferManagerRelation bmr,
+							 Buffer *buffers,
+							 ForkNumber forknum,
+							 BlockNumber blocknum,
+							 int *nblocks,
+							 BufferAccessStrategy strategy,
+							 int flags,
+							 ReadBuffersOperation *operation);
+extern void WaitReadBuffers(ReadBuffersOperation *operation);
+
 extern void ReleaseBuffer(Buffer buffer);
 extern void UnlockReleaseBuffer(Buffer buffer);
 extern bool BufferIsExclusiveLocked(Buffer buffer);
@@ -250,6 +292,9 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
 
 extern bool BgBufferSync(struct WritebackContext *wb_context);
 
+extern void LimitAdditionalPins(uint32 *additional_pins);
+extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+
 /* in buf_init.c */
 extern void InitBufferPool(void);
 extern Size BufferShmemSize(void);
diff --git a/src/include/storage/streaming_read.h b/src/include/storage/streaming_read.h
new file mode 100644
index 00000000000..c4d3892bb26
--- /dev/null
+++ b/src/include/storage/streaming_read.h
@@ -0,0 +1,52 @@
+#ifndef STREAMING_READ_H
+#define STREAMING_READ_H
+
+#include "storage/bufmgr.h"
+#include "storage/fd.h"
+#include "storage/smgr.h"
+
+/* Default tuning, reasonable for many users. */
+#define PGSR_FLAG_DEFAULT 0x00
+
+/*
+ * I/O streams that are performing maintenance work on behalf of potentially
+ * many users.
+ */
+#define PGSR_FLAG_MAINTENANCE 0x01
+
+/*
+ * We usually avoid issuing prefetch advice automatically when sequential
+ * access is detected, but this flag explicitly disables it, for cases that
+ * might not be correctly detected.  Explicit advice is known to perform worse
+ * than letting the kernel (at least Linux) detect sequential access.
+ */
+#define PGSR_FLAG_SEQUENTIAL 0x02
+
+/*
+ * We usually ramp up from smaller reads to larger ones, to support users who
+ * don't know if it's worth reading lots of buffers yet.  This flag disables
+ * that, declaring ahead of time that we'll be reading all available buffers.
+ */
+#define PGSR_FLAG_FULL 0x04
+
+struct PgStreamingRead;
+typedef struct PgStreamingRead PgStreamingRead;
+
+/* Callback that returns the next block number to read. */
+typedef BlockNumber (*PgStreamingReadBufferCB) (PgStreamingRead *pgsr,
+												void *pgsr_private,
+												void *per_buffer_private);
+
+extern PgStreamingRead *pg_streaming_read_buffer_alloc(int flags,
+													   void *pgsr_private,
+													   size_t per_buffer_private_size,
+													   BufferAccessStrategy strategy,
+													   BufferManagerRelation bmr,
+													   ForkNumber forknum,
+													   PgStreamingReadBufferCB next_block_cb);
+
+extern void pg_streaming_read_prefetch(PgStreamingRead *pgsr);
+extern Buffer pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_private);
+extern void pg_streaming_read_free(PgStreamingRead *pgsr);
+
+#endif
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 95ae7845d86..aea8babd71a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,8 @@ PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
 PgStat_WalStats
+PgStreamingRead
+PgStreamingReadRange
 PgXmlErrorContext
 PgXmlStrictness
 Pg_finfo_record
@@ -2267,6 +2269,7 @@ ReInitializeDSMForeignScan_function
 ReScanForeignScan_function
 ReadBufPtrType
 ReadBufferMode
+ReadBuffersOperation
 ReadBytePtrType
 ReadExtraTocPtrType
 ReadFunc
-- 
2.40.1

v5a-0006-Vacuum-first-pass-uses-Streaming-Read-interface.patch (text/x-diff; charset=us-ascii)
From 2e5ed537b9053ff3212177c1732d6afa2100fa0f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 11:29:02 -0500
Subject: [PATCH v5a 6/7] Vacuum first pass uses Streaming Read interface

Now vacuum's first pass, which HOT prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by implementing a
streaming read callback which invokes heap_vac_scan_get_next_block().
---
 src/backend/access/heap/vacuumlazy.c | 79 +++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65d257aab83..fbbc87938e4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -54,6 +54,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/streaming_read.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -168,7 +169,12 @@ typedef struct LVRelState
 	char	   *relnamespace;
 	char	   *relname;
 	char	   *indname;		/* Current index name */
-	BlockNumber blkno;			/* used only for heap operations */
+
+	/*
+	 * The current block being processed by vacuum, used only for heap
+	 * operations; primarily for error reporting and logging.
+	 */
+	BlockNumber blkno;
 	OffsetNumber offnum;		/* used only for heap operations */
 	VacErrPhase phase;
 	bool		verbose;		/* VACUUM VERBOSE? */
@@ -189,6 +195,12 @@ typedef struct LVRelState
 	BlockNumber missed_dead_pages;	/* # pages with missed dead tuples */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 
+	/*
+	 * The most recent block submitted in the streaming read callback by the
+	 * first vacuum pass.
+	 */
+	BlockNumber blkno_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -232,7 +244,7 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
+static void heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 										 BlockNumber *blkno,
 										 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
@@ -416,6 +428,9 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	vacrel->nonempty_pages = 0;
 	/* dead_items_alloc allocates vacrel->dead_items later on */
 
+	/* relies on InvalidBlockNumber overflowing to 0 */
+	vacrel->blkno_prefetch = InvalidBlockNumber;
+
 	/* Allocate/initialize output statistics state */
 	vacrel->new_rel_tuples = 0;
 	vacrel->new_live_tuples = 0;
@@ -776,6 +791,22 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	}
 }
 
+static BlockNumber
+vacuum_scan_pgsr_next(PgStreamingRead *pgsr,
+					  void *pgsr_private, void *per_buffer_data)
+{
+	LVRelState *vacrel = pgsr_private;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
+
+	vacrel->blkno_prefetch++;
+
+	heap_vac_scan_get_next_block(vacrel,
+								 vacrel->blkno_prefetch, &vacrel->blkno_prefetch,
+								 all_visible_according_to_vm);
+
+	return vacrel->blkno_prefetch;
+}
+
 /*
  *	lazy_scan_heap() -- workhorse function for VACUUM
  *
@@ -815,12 +846,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
 	BlockNumber rel_pages = vacrel->rel_pages,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
-	/* relies on InvalidBlockNumber overflowing to 0 */
-	BlockNumber blkno = InvalidBlockNumber;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
@@ -828,6 +858,11 @@ lazy_scan_heap(LVRelState *vacrel)
 		PROGRESS_VACUUM_MAX_DEAD_TUPLES
 	};
 	int64		initprog_val[3];
+	PgStreamingRead *pgsr;
+
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(bool), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_scan_pgsr_next);
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
@@ -838,13 +873,19 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->skip.next_unskippable_block = InvalidBlockNumber;
 	vacrel->skip.vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_get_next_block(vacrel, blkno + 1,
-										&blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr, (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
+		BlockNumber blkno;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+
+		page = BufferGetPage(buf);
 
 		vacrel->scanned_pages++;
 
@@ -912,9 +953,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vacrel->skip.vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -970,7 +1008,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							all_visible_according_to_vm,
+							*all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1027,7 +1065,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, vacrel->rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1042,6 +1080,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	pg_streaming_read_free(pgsr);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1053,11 +1093,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (vacrel->rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, vacrel->rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, vacrel->rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1090,7 +1130,7 @@ lazy_scan_heap(LVRelState *vacrel)
  *
  * The block number and visibility status of the next block to process are set
  * in blkno and all_visible_according_to_vm. heap_vac_scan_get_next_block()
- * returns false if there are no further blocks to process.
+ * sets blkno to InvalidBlockNumber if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here; vacuum options and information about the
  * relation are read and vacrel->skippedallvis is set to ensure we don't
@@ -1110,7 +1150,7 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static bool
+static void
 heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 							 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
@@ -1119,7 +1159,7 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 	if (next_block >= vacrel->rel_pages)
 	{
 		*blkno = InvalidBlockNumber;
-		return false;
+		return;
 	}
 
 	if (vacrel->skip.next_unskippable_block == InvalidBlockNumber ||
@@ -1214,7 +1254,6 @@ heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
 		*all_visible_according_to_vm = true;
 
 	*blkno = next_block;
-	return true;
 }
 
 /*
-- 
2.40.1

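One subtlety worth calling out in the patch above: vacrel->blkno_prefetch is initialized to InvalidBlockNumber and then pre-incremented in the callback, relying on well-defined unsigned wraparound to produce block 0 on the first call (the "relies on InvalidBlockNumber overflowing to 0" comment). A standalone sketch of that trick, with type names modeled on PostgreSQL's block.h rather than the real headers:

```c
#include <assert.h>
#include <stdint.h>

/* Modeled on PostgreSQL's block.h definitions. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/*
 * Unsigned overflow is well-defined in C: 0xFFFFFFFF + 1 wraps to 0, so
 * initializing to InvalidBlockNumber and incrementing before use yields
 * block 0 on the first callback invocation.
 */
static BlockNumber
first_block_after_init(void)
{
    BlockNumber blkno_prefetch = InvalidBlockNumber;

    blkno_prefetch++;
    return blkno_prefetch;
}
```

This is why the same pattern works in vacuum_scan_pgsr_next() without any "is this the first call?" special case.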
Attachment: v5a-0007-Vacuum-second-pass-uses-Streaming-Read-interface.patch (text/x-diff)
From 84a5907b486d1f6c2ffe029a2e28fc557065739f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v5a 7/7] Vacuum second pass uses Streaming Read interface

Now vacuum's second pass, which removes dead items referring to dead
tuples catalogued in the first pass, uses the streaming read API by
implementing a streaming read callback which returns the next block
containing previously catalogued dead items. A new struct,
VacReapBlkState, is introduced to provide the caller with the starting
and ending indexes of dead items to vacuum.

ci-os-only:
---
 src/backend/access/heap/vacuumlazy.c | 110 ++++++++++++++++++++-------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 85 insertions(+), 26 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index fbbc87938e4..68c146984b1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -201,6 +201,12 @@ typedef struct LVRelState
 	 */
 	BlockNumber blkno_prefetch;
 
+	/*
+	 * The index of the next TID in dead_items to reap during the second
+	 * vacuum pass.
+	 */
+	int			idx_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -242,6 +248,21 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * State set up in streaming read callback during vacuum's second pass which
+ * removes dead items referring to dead tuples cataloged in the first pass
+ */
+typedef struct VacReapBlkState
+{
+	/*
+	 * The indexes of the TIDs of the first and last dead tuples in a single
+	 * block in the currently vacuumed relation. The callback will set these
+	 * up prior to adding this block to the stream.
+	 */
+	int			start_idx;
+	int			end_idx;
+} VacReapBlkState;
+
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vac_scan_get_next_block(LVRelState *vacrel, BlockNumber next_block,
@@ -260,8 +281,9 @@ static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 static void lazy_vacuum(LVRelState *vacrel);
 static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void lazy_vacuum_heap_rel(LVRelState *vacrel);
-static int	lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
-								  Buffer buffer, int index, Buffer vmbuffer);
+static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+								  Buffer buffer, Buffer vmbuffer,
+								  VacReapBlkState *rbstate);
 static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
 static void lazy_cleanup_all_indexes(LVRelState *vacrel);
 static IndexBulkDeleteResult *lazy_vacuum_one_index(Relation indrel,
@@ -2401,6 +2423,37 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_pgsr_next(PgStreamingRead *pgsr,
+						 void *pgsr_private,
+						 void *per_buffer_data)
+{
+	BlockNumber blkno;
+	LVRelState *vacrel = pgsr_private;
+	VacReapBlkState *rbstate = per_buffer_data;
+
+	VacDeadItems *dead_items = vacrel->dead_items;
+
+	if (vacrel->idx_prefetch == dead_items->num_items)
+		return InvalidBlockNumber;
+
+	blkno = ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+	rbstate->start_idx = vacrel->idx_prefetch;
+
+	for (; vacrel->idx_prefetch < dead_items->num_items; vacrel->idx_prefetch++)
+	{
+		BlockNumber curblkno =
+			ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+
+		if (blkno != curblkno)
+			break;				/* past end of tuples for this block */
+	}
+
+	rbstate->end_idx = vacrel->idx_prefetch;
+
+	return blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2422,7 +2475,9 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
-	int			index = 0;
+	Buffer		buf;
+	PgStreamingRead *pgsr;
+	VacReapBlkState *rbstate;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2440,17 +2495,21 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 VACUUM_ERRCB_PHASE_VACUUM_HEAP,
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
-	while (index < vacrel->dead_items->num_items)
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(VacReapBlkState), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_reap_lp_pgsr_next);
+
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr,
+														   (void **) &rbstate)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		blkno = ItemPointerGetBlockNumber(&vacrel->dead_items->items[index]);
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		/*
 		 * Pin the visibility map page in case we need to mark the page
@@ -2460,10 +2519,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
-		index = lazy_vacuum_heap_page(vacrel, blkno, buf, index, vmbuffer);
+		lazy_vacuum_heap_page(vacrel, blkno, buf, vmbuffer, rbstate);
 
 		/* Now that we've vacuumed the page, record its available space */
 		page = BufferGetPage(buf);
@@ -2482,14 +2539,16 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 	 * We set all LP_DEAD items from the first heap pass to LP_UNUSED during
 	 * the second heap pass.  No more, no less.
 	 */
-	Assert(index > 0);
+	Assert(rbstate->end_idx > 0);
 	Assert(vacrel->num_index_scans > 1 ||
-		   (index == vacrel->lpdead_items &&
+		   (rbstate->end_idx == vacrel->lpdead_items &&
 			vacuumed_pages == vacrel->lpdead_item_pages));
 
+	pg_streaming_read_free(pgsr);
+
 	ereport(DEBUG2,
 			(errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
-					vacrel->relname, (long long) index, vacuumed_pages)));
+					vacrel->relname, (long long) rbstate->end_idx, vacuumed_pages)));
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
@@ -2503,13 +2562,12 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
  * cleanup lock is also acceptable).  vmbuffer must be valid and already have
  * a pin on blkno's visibility map page.
  *
- * index is an offset into the vacrel->dead_items array for the first listed
- * LP_DEAD item on the page.  The return value is the first index immediately
- * after all LP_DEAD items for the same page in the array.
+ * Given a block and dead items recorded during the first pass, set those items
+ * dead and truncate the line pointer array. Update the VM as appropriate.
  */
-static int
-lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
-					  int index, Buffer vmbuffer)
+static void
+lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+					  Buffer buffer, Buffer vmbuffer, VacReapBlkState *rbstate)
 {
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Page		page = BufferGetPage(buffer);
@@ -2530,16 +2588,17 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; index < dead_items->num_items; index++)
+	for (int i = rbstate->start_idx; i < rbstate->end_idx; i++)
 	{
-		BlockNumber tblk;
 		OffsetNumber toff;
+		ItemPointer dead_item;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&dead_items->items[index]);
-		if (tblk != blkno)
-			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&dead_items->items[index]);
+		dead_item = &dead_items->items[i];
+
+		Assert(ItemPointerGetBlockNumber(dead_item) == blkno);
+
+		toff = ItemPointerGetOffsetNumber(dead_item);
 		itemid = PageGetItemId(page, toff);
 
 		Assert(ItemIdIsDead(itemid) && !ItemIdHasStorage(itemid));
@@ -2609,7 +2668,6 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
-	return index;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index aea8babd71a..a8f0b5f091d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2972,6 +2972,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacReapBlkState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.40.1

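For readers skimming the second-pass diff above, the core of vacuum_reap_lp_pgsr_next() is just "group a sorted TID array by block and hand back the index range for each block". A self-contained sketch of that grouping logic; DeadItem and ReapState are simplified stand-ins for PostgreSQL's ItemPointerData array and LVRelState, not the real types:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

typedef struct DeadItem
{
    BlockNumber blkno;          /* heap block containing the dead item */
    uint16_t    offnum;         /* line pointer offset within the block */
} DeadItem;

typedef struct ReapState
{
    const DeadItem *items;      /* sorted by (blkno, offnum) */
    int         num_items;
    int         next_idx;       /* analogous to vacrel->idx_prefetch */
} ReapState;

/*
 * Return the next block containing dead items and set [*start, *end) to
 * the index range in items belonging to that block, or InvalidBlockNumber
 * once the array is exhausted.
 */
static BlockNumber
next_reap_block(ReapState *st, int *start, int *end)
{
    BlockNumber blkno;

    if (st->next_idx == st->num_items)
        return InvalidBlockNumber;

    blkno = st->items[st->next_idx].blkno;
    *start = st->next_idx;

    while (st->next_idx < st->num_items &&
           st->items[st->next_idx].blkno == blkno)
        st->next_idx++;

    *end = st->next_idx;        /* first index past this block's items */
    return blkno;
}
```

In the patch, the [start_idx, end_idx) pair travels with the buffer as per-buffer data (VacReapBlkState), so lazy_vacuum_heap_page() no longer needs to rediscover where the block's items end.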
#15 Melanie Plageman
melanieplageman@gmail.com
In reply to: Heikki Linnakangas (#13)
Re: Confine vacuum skip logic to lazy_scan_skip

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

On 27/02/2024 21:47, Melanie Plageman wrote:

The attached v5 has some simplifications when compared to v4 but takes
largely the same approach.

0001-0004 are refactoring

I'm looking at just these 0001-0004 patches for now. I like those changes a
lot for the sake of readablity even without any of the later patches.

Thanks! And thanks so much for the review!

I've done a small performance experiment comparing a branch with all of
the patches applied (your v6 0001-0009) against master. I made an 11 GB
table that has 1,394,328 blocks. For setup, I vacuumed it to update the
VM and made sure it was entirely in shared buffers. All of this was to
make sure all of the blocks would be skipped and we spend the majority
of the time spinning through the lazy_scan_heap() code. Then I ran
vacuum again (the actual test). I saw vacuum go from 13 ms to 10 ms
with the patches applied.

I think I need to do some profiling to see if the difference is actually
due to our code changes, but I thought I would share preliminary
results.

I made some further changes. I kept them as separate commits for easier
review, see the commit messages for details. Any thoughts on those changes?

I've given some inline feedback on most of the extra patches you added.
Short answer is they all seem fine to me except I have a reservations
about 0008 because of the number of blkno variables flying around. I
didn't have a chance to rebase these into my existing changes today, so
either I will do it tomorrow or, if you are feeling like you're on a
roll and want to do it, that also works!

I feel heap_vac_scan_get_next_block() function could use some love. Maybe
just some rewording of the comments, or maybe some other refactoring; not
sure. But I'm pretty happy with the function signature and how it's called.

I was wondering if we should remove the "get" and just go with
heap_vac_scan_next_block(). I didn't do that originally because I didn't
want to imply that the next block was literally the sequentially next
block, but I think maybe I was overthinking it.

Another idea is to call it heap_scan_vac_next_block() and then the order
of the words is more like the table AM functions that get the next block
(e.g. heapam_scan_bitmap_next_block()). Though maybe we don't want it to
be too similar to those since this isn't a table AM callback.

As for other refactoring and other rewording of comments and such, I
will take a pass at this tomorrow.

BTW, do we have tests that would fail if we botched up
heap_vac_scan_get_next_block() so that it would skip pages incorrectly, for
example? Not asking you to write them for this patch, but I'm just
wondering.

So, while developing this, when I messed up and skipped blocks I
shouldn't have, vacuum would error out with the "found xmin from before
relfrozenxid" error -- which would cause random tests to fail. I know
that's not a correctly failing test of this code. I think there might be
some tests in the verify_heapam tests that could/do test this kind of
thing but I don't remember them failing for me during development -- so
I didn't spend much time looking at them.

I would also sometimes get freespace or VM tests that would fail because
those blocks that are incorrectly skipped were meant to be reflected in
the FSM or VM in those tests.

All of that is to say, perhaps we should write a more targeted test?

When I was writing the code, I added logging of skipped blocks and then
came up with different scenarios and ran them on master and with the
patch and diffed the logs.
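If a more targeted test were written, the decision it most needs to pin down is small: whether a run of all-visible blocks is long enough to skip. A minimal standalone model of that predicate (the threshold value 32 mirrors SKIP_PAGES_THRESHOLD in vacuumlazy.c; everything else here is a simplification, not the real code path):

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Mirrors SKIP_PAGES_THRESHOLD in vacuumlazy.c (32 at the time of writing). */
#define SKIP_PAGES_THRESHOLD ((BlockNumber) 32)

/*
 * Decide whether the run of skippable (all-visible) blocks between
 * next_block and next_unskippable_block is long enough to be worth
 * skipping. Short runs are read anyway so the scan stays sequential
 * enough to benefit from OS readahead.
 */
static int
range_is_skipped(BlockNumber next_block, BlockNumber next_unskippable_block)
{
    return (next_unskippable_block - next_block) >= SKIP_PAGES_THRESHOLD;
}
```

A test built around this boundary (31 vs. 32 blocks) would have caught the kind of off-by-one skipping bugs described above without relying on downstream FSM/VM test failures.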

From b4047b941182af0643838fde056c298d5cc3ae32 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:13:42 +0200
Subject: [PATCH v6 5/9] Remove unused 'skipping_current_range' field

---
src/backend/access/heap/vacuumlazy.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 65d257aab83..51391870bf3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -217,8 +217,6 @@ typedef struct LVRelState
Buffer		vmbuffer;
/* Next unskippable block's visibility status */
bool		next_unskippable_allvis;
-		/* Whether or not skippable blocks should be skipped */
-		bool		skipping_current_range;
}			skip;
} LVRelState;

--
2.39.2

Oops! I thought I removed this. I must have forgotten.

From 27e431e8dc69bbf09d831cb1cf2903d16f177d74 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:58:57 +0200
Subject: [PATCH v6 6/9] Move vmbuffer back to a local variable in
lazy_scan_heap()

It felt confusing that we passed around the current block, 'blkno', as
an argument to lazy_scan_new_or_empty() and lazy_scan_prune(), but
'vmbuffer' was accessed directly in the 'scan_state'.

It was also a bit vague when exactly 'vmbuffer' was valid. Calling
heap_vac_scan_get_next_block() set it, sometimes, to a buffer that
might or might not contain the VM bit for 'blkno'. But other
functions, like lazy_scan_prune(), assumed it to contain the correct
buffer. That was fixed up by visibilitymap_pin(). But clearly it was not
"owned" by heap_vac_scan_get_next_block(), like the other 'scan_state'
fields.

I moved it back to a local variable, like it was. Maybe there would be
even better ways to handle it, but at least this is not worse than
what we have in master currently.

I'm fine with this. I did it the way I did (grouping it with the
"next_unskippable_block" in the skip struct), because I think that this
vmbuffer is always the buffer containing the VM bit for the next
unskippable block -- which sometimes is the block returned by
heap_vac_scan_get_next_block() and sometimes isn't.

I agree it might be best as a local variable but perhaps we could retain
the comment about it being the block of the VM containing the bit for the
next unskippable block. (Honestly, the whole thing is very confusing).

From 519e26a01b6e6974f9e0edb94b00756af053f7ee Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:27:57 +0200
Subject: [PATCH v6 7/9] Rename skip_state

I don't want to emphasize the "skipping" part. Rather, it's the state
owned by the heap_vac_scan_get_next_block() function.

This makes sense to me. Skipping should be private details of vacuum's
get_next_block functionality. Though the name is a bit long. Maybe we
don't need the "get" and "state" parts (it is already in a struct with
state in the name)?

From 6dfae936a29e2d3479273f8ab47778a596258b16 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 21:03:19 +0200
Subject: [PATCH v6 8/9] Track 'current_block' in the skip state

The caller was expected to always pass last blk + 1. It's not clear if
the next_unskippable block accounting would work correctly if you
passed something else. So rather than expecting the caller to do that,
have heap_vac_scan_get_next_block() keep track of the last returned
block itself, in the 'skip' state.

This is largely redundant with the LVRelState->blkno field. But that
one is currently only used for error reporting, so it feels best to
give heap_vac_scan_get_next_block() its own field that it owns.

I understand and agree with you that relying on blkno + 1 is bad and we
should make the "next_block" state keep track of the current block.

Though, I now find it easy to confuse
lvrelstate->get_next_block_state->current_block, lvrelstate->blkno and
the local variable blkno in lazy_scan_heap(). I think it is a naming
thing and not that we shouldn't have all three. I'll think more about it
in the morning.

From 619556cad4aad68d1711c12b962e9002e56d8db2 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 21:35:11 +0200
Subject: [PATCH v6 9/9] Comment & whitespace cleanup

I moved some of the paragraphs to inside the
heap_vac_scan_get_next_block() function. I found the explanation in
the function comment at the old place like too much detail. Someone
looking at the function signature and how to call it would not care
about all the details of what can or cannot be skipped.

LGTM.

Thanks again.

- Melanie

#16 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#15)
7 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

I made some further changes. I kept them as separate commits for easier
review, see the commit messages for details. Any thoughts on those changes?

I've given some inline feedback on most of the extra patches you added.
Short answer is they all seem fine to me except I have a reservations
about 0008 because of the number of blkno variables flying around. I
didn't have a chance to rebase these into my existing changes today, so
either I will do it tomorrow or, if you are feeling like you're on a
roll and want to do it, that also works!

Attached v7 contains all of the changes that you suggested plus some
additional cleanups here and there.

I feel heap_vac_scan_get_next_block() function could use some love. Maybe
just some rewording of the comments, or maybe some other refactoring; not
sure. But I'm pretty happy with the function signature and how it's called.

I've cleaned up the comments on heap_vac_scan_next_block() in the first
couple patches (not so much in the streaming read user). Let me know if
it addresses your feelings or if I should look for other things I could
change.

I will say that now all of the variable names are *very* long. I didn't
want to remove the "state" from LVRelState->next_block_state. (In fact, I
kind of miss the "get". But I had to draw the line somewhere.) I think
without "state" in the name, next_block sounds too much like a function.

Any ideas for shortening the names of next_block_state and its members
or are you fine with them?

I was wondering if we should remove the "get" and just go with
heap_vac_scan_next_block(). I didn't do that originally because I didn't
want to imply that the next block was literally the sequentially next
block, but I think maybe I was overthinking it.

Another idea is to call it heap_scan_vac_next_block() and then the order
of the words is more like the table AM functions that get the next block
(e.g. heapam_scan_bitmap_next_block()). Though maybe we don't want it to
be too similar to those since this isn't a table AM callback.

I've done a version of this.

From 27e431e8dc69bbf09d831cb1cf2903d16f177d74 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Wed, 6 Mar 2024 20:58:57 +0200
Subject: [PATCH v6 6/9] Move vmbuffer back to a local variable in
lazy_scan_heap()

It felt confusing that we passed around the current block, 'blkno', as
an argument to lazy_scan_new_or_empty() and lazy_scan_prune(), but
'vmbuffer' was accessed directly in the 'scan_state'.

It was also a bit vague when exactly 'vmbuffer' was valid. Calling
heap_vac_scan_get_next_block() set it, sometimes, to a buffer that
might or might not contain the VM bit for 'blkno'. But other
functions, like lazy_scan_prune(), assumed it to contain the correct
buffer. That was fixed up by visibilitymap_pin(). But clearly it was not
"owned" by heap_vac_scan_get_next_block(), like the other 'scan_state'
fields.

I moved it back to a local variable, like it was. Maybe there would be
even better ways to handle it, but at least this is not worse than
what we have in master currently.

I'm fine with this. I did it the way I did (grouping it with the
"next_unskippable_block" in the skip struct), because I think that this
vmbuffer is always the buffer containing the VM bit for the next
unskippable block -- which sometimes is the block returned by
heap_vac_scan_get_next_block() and sometimes isn't.

I agree it might be best as a local variable but perhaps we could retain
the comment about it being the block of the VM containing the bit for the
next unskippable block. (Honestly, the whole thing is very confusing).

In 0001-0004 I've stuck with only having the local variable vmbuffer in
lazy_scan_heap().

In 0006 (introducing pass 1 vacuum streaming read user) I added a
vmbuffer back to the next_block_state (while also keeping the local
variable vmbuffer in lazy_scan_heap()). The vmbuffer in lazy_scan_heap()
contains the block of the VM containing visi information for the next
unskippable block or for the current block if its visi information
happens to be in the same block of the VM as either 1) the next
unskippable block or 2) the most recently processed heap block.

Streaming read vacuum separates this visibility check in
heap_vac_scan_next_block() from the main loop of lazy_scan_heap(), so we
can't just use a local variable anymore. Now the local variable vmbuffer
in lazy_scan_heap() will already contain the block with the visibility
information for the to-be-processed block only if it happens to be in the
same VM block as the most recently processed heap block. That means
potentially more VM fetches.

However, by adding a vmbuffer to next_block_state, the callback may be
able to avoid extra VM fetches from one invocation to the next.
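The saving described here can be sketched standalone: re-read a VM page only when the heap block being checked crosses into a different VM page. BLOCKS_PER_VM_PAGE below is a made-up round number purely for illustration; the real coverage of one VM page is derived from BLCKSZ in visibilitymap.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical: heap blocks covered by one VM page (real value from BLCKSZ). */
#define BLOCKS_PER_VM_PAGE 1000

typedef struct VmCache
{
    BlockNumber cached_mapblk;  /* VM page currently held */
    int         valid;          /* have we read any VM page yet? */
    int         fetches;        /* number of simulated VM page reads */
} VmCache;

/* Consult the VM for heapblk, "reading" a new VM page only when needed. */
static void
vm_check(VmCache *c, BlockNumber heapblk)
{
    BlockNumber mapblk = heapblk / BLOCKS_PER_VM_PAGE;

    if (!c->valid || mapblk != c->cached_mapblk)
    {
        c->cached_mapblk = mapblk;  /* simulate fetching the VM page */
        c->valid = 1;
        c->fetches++;
    }
}
```

Keeping this cache in next_block_state gives the callback the same consecutive-block locality the old inline loop enjoyed, even though the callback now runs ahead of the block actually being processed.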

Note that next_block->current_block in the streaming read vacuum context
is actually the prefetch block.

- Melanie

Attachments:

Attachment: v7-0001-lazy_scan_skip-remove-unneeded-local-var-nskippab.patch (text/x-diff)
From 5018cf4a882d48bc424301400cb40aa7a36955b1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v7 1/7] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8b320c3f89a..1dc6cc8e4db 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1103,8 +1103,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+				next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1161,7 +1160,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1174,7 +1172,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.40.1

Attachment: v7-0002-Add-lazy_scan_skip-next-block-state-to-LVRelState.patch (text/x-diff)
From 4d49028df51550af931f70c21a920a22ff09ba48 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v7 2/7] Add lazy_scan_skip next block state to LVRelState

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce a struct to LVRelState containing members tracking the current
block and the information needed to determine whether or not to skip
ranges less than SKIP_PAGES_THRESHOLD.

While we are at it, expand the comments in lazy_scan_skip(), including
descriptions of the role and expectations of its function parameters and
more detail on when skippable blocks are not skipped.

Discussion: https://postgr.es/m/flat/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 124 ++++++++++++++++++---------
 1 file changed, 84 insertions(+), 40 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1dc6cc8e4db..accc6303fa2 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,22 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/*
+	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
+	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 */
+	struct
+	{
+		/* The last block lazy_scan_skip() returned and vacuum processed */
+		BlockNumber current_block;
+		/* Next unskippable block */
+		BlockNumber next_unskippable_block;
+		/* Next unskippable block's visibility status */
+		bool		next_unskippable_allvis;
+		/* Whether or not skippable blocks should be skipped */
+		bool		skipping_current_range;
+	}			next_block_state;
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -214,13 +230,9 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static void lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -803,12 +815,9 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -822,10 +831,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	/* Initialize for first lazy_scan_skip() call */
+	vacrel->next_block_state.current_block = InvalidBlockNumber;
+	vacrel->next_block_state.next_unskippable_block = InvalidBlockNumber;
+
 	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
+	lazy_scan_skip(vacrel, &vmbuffer);
 	for (blkno = 0; blkno < rel_pages; blkno++)
 	{
 		Buffer		buf;
@@ -834,26 +845,21 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
+		if (blkno == vacrel->next_block_state.next_unskippable_block)
 		{
 			/*
 			 * Can't skip this page safely.  Must scan the page.  But
 			 * determine the next skippable range after the page first.
 			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
-
-			Assert(next_unskippable_block >= blkno + 1);
+			all_visible_according_to_vm = vacrel->next_block_state.next_unskippable_allvis;
+			lazy_scan_skip(vacrel, &vmbuffer);
 		}
 		else
 		{
 			/* Last page always scanned (may need to set nonempty_pages) */
 			Assert(blkno < rel_pages - 1);
 
-			if (skipping_current_range)
+			if (vacrel->next_block_state.skipping_current_range)
 				continue;
 
 			/* Current range is too small to skip -- just scan the page */
@@ -1036,7 +1042,10 @@ lazy_scan_heap(LVRelState *vacrel)
 
 	vacrel->blkno = InvalidBlockNumber;
 	if (BufferIsValid(vmbuffer))
+	{
 		ReleaseBuffer(vmbuffer);
+		vmbuffer = InvalidBuffer;
+	}
 
 	/* report that everything is now scanned */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
@@ -1080,15 +1089,20 @@ lazy_scan_heap(LVRelState *vacrel)
  *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
  *
  * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * blocks to skip via the visibility map.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * vacrel is an in/out parameter here; vacuum options and information about the
+ * relation are read, members of vacrel->next_block_state are read and set as
+ * bookkeeping for this function, and vacrel->skippedallvis is set to ensure we
+ * don't advance relfrozenxid when we have skipped vacuuming all-visible
+ * blocks.
+ *
+ * vmbuffer is an output parameter which, upon return, will contain the block
+ * from the VM containing visibility information for the next unskippable heap
+ * block. If we decide not to skip this heap block, the caller is responsible
+ * for fetching the correct VM block into vmbuffer before using it. This is
+ * okay as providing it as an output parameter is an optimization, not a
+ * requirement.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1098,15 +1112,38 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static void
+lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block;
+	/* Use local variables for better optimized loop code */
+	BlockNumber rel_pages = vacrel->rel_pages;
+	/* Relies on InvalidBlockNumber + 1 == 0 */
+	BlockNumber next_block = vacrel->next_block_state.current_block + 1;
+	BlockNumber next_unskippable_block = next_block;
+
 	bool		skipsallvis = false;
 
-	*next_unskippable_allvis = true;
+	vacrel->next_block_state.next_unskippable_allvis = true;
+
+	/*
+	 * A block is unskippable if it is not all visible according to the
+	 * visibility map. It is also unskippable if it is the last block in the
+	 * relation, if the vacuum is an aggressive vacuum, or if
+	 * DISABLE_PAGE_SKIPPING was passed to vacuum.
+	 *
+	 * Even if a block is skippable, we may choose not to skip it if the range
+	 * of skippable blocks is too small (below SKIP_PAGES_THRESHOLD). As a
+	 * consequence, we must keep track of the next truly unskippable block and
+	 * its visibility status along with whether or not we are skipping the
+	 * current range of skippable blocks. This can be used to derive the next
+	 * block lazy_scan_heap() must process and its visibility status.
+	 *
+	 * The block number and visibility status of the next unskippable block
+	 * are set in next_block_state->next_unskippable_block and
+	 * next_unskippable_allvis. next_block_state->skipping_current_range
+	 * indicates to the caller whether or not it is processing a skippable
+	 * (and thus all-visible) block.
+	 */
 	while (next_unskippable_block < rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
@@ -1116,7 +1153,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
+			vacrel->next_block_state.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1137,7 +1174,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		if (!vacrel->skipwithvm)
 		{
 			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
+			vacrel->next_block_state.next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1162,6 +1199,10 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		next_unskippable_block++;
 	}
 
+	Assert(vacrel->next_block_state.next_unskippable_block >=
+		   vacrel->next_block_state.current_block);
+	vacrel->next_block_state.next_unskippable_block = next_unskippable_block;
+
 	/*
 	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
 	 * pages.  Since we're reading sequentially, the OS should be doing
@@ -1172,16 +1213,19 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (vacrel->next_block_state.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
+		vacrel->next_block_state.skipping_current_range = false;
 	else
 	{
-		*skipping_current_range = true;
+		vacrel->next_block_state.skipping_current_range = true;
 		if (skipsallvis)
 			vacrel->skippedallvis = true;
 	}
 
-	return next_unskippable_block;
+	if (next_unskippable_block >= rel_pages)
+		next_block = InvalidBlockNumber;
+
+	vacrel->next_block_state.current_block = next_block;
 }
 
 /*
-- 
2.40.1

v7-0003-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch
From 991c5a7ed46cc5dee36352194058ffb06a4e8670 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sat, 30 Dec 2023 16:59:27 -0500
Subject: [PATCH v7 3/7] Confine vacuum skip logic to lazy_scan_skip

In preparation for vacuum to use the streaming read interface [1] (and
eventually AIO), refactor vacuum's logic for skipping blocks such that
it is entirely confined to lazy_scan_skip(). This turns lazy_scan_skip()
and its next block state in LVRelState into an iterator which yields
blocks to lazy_scan_heap(). Such a structure is conducive to an async
interface. While we are at it, rename lazy_scan_skip() to
heap_vac_scan_next_block(), which now more accurately describes it.

By always calling heap_vac_scan_next_block(), instead of only when we
have reached the next unskippable block, we no longer need the
skipping_current_range variable. Furthermore, lazy_scan_heap() no longer
needs to manage the skipped range by checking if we reached the end in
order to then call heap_vac_scan_next_block(). And
heap_vac_scan_next_block() can derive the visibility status of a block
from whether or not we are in a skippable range; that is, if the next
block is equal to the next unskippable block, it takes that block's
cached visibility status, otherwise the block is all-visible.

[1] https://postgr.es/m/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

Discussion: https://postgr.es/m/flat/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 228 ++++++++++++++-------------
 1 file changed, 115 insertions(+), 113 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index accc6303fa2..8d715caccc1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -206,19 +206,20 @@ typedef struct LVRelState
 	int64		missed_dead_tuples; /* # removable, but not removed */
 
 	/*
-	 * Parameters maintained by lazy_scan_skip() to manage skipping ranges of
-	 * pages greater than SKIP_PAGES_THRESHOLD.
+	 * Parameters maintained by heap_vac_scan_next_block() to manage getting
+	 * the next block for vacuum to process.
 	 */
 	struct
 	{
-		/* The last block lazy_scan_skip() returned and vacuum processed */
+		/*
+		 * The last block heap_vac_scan_next_block() returned and vacuum
+		 * processed
+		 */
 		BlockNumber current_block;
 		/* Next unskippable block */
 		BlockNumber next_unskippable_block;
 		/* Next unskippable block's visibility status */
 		bool		next_unskippable_allvis;
-		/* Whether or not skippable blocks should be skipped */
-		bool		skipping_current_range;
 	}			next_block_state;
 } LVRelState;
 
@@ -232,7 +233,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static void lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer);
+static bool heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
+									 BlockNumber *blkno,
+									 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -816,6 +819,8 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -831,41 +836,18 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Initialize for first lazy_scan_skip() call */
+	/* Initialize for first heap_vac_scan_next_block() call */
 	vacrel->next_block_state.current_block = InvalidBlockNumber;
 	vacrel->next_block_state.next_unskippable_block = InvalidBlockNumber;
 
-	/* Set up an initial range of skippable blocks using the visibility map */
-	lazy_scan_skip(vacrel, &vmbuffer);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+	while (heap_vac_scan_next_block(vacrel, &vmbuffer,
+									&blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == vacrel->next_block_state.next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = vacrel->next_block_state.next_unskippable_allvis;
-			lazy_scan_skip(vacrel, &vmbuffer);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (vacrel->next_block_state.skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1086,10 +1068,16 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_next_block() -- get next block for vacuum to process
+ *
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum, using the visibility map, vacuum options, and various
+ * thresholds to skip blocks which do not need to be processed and set blkno to
+ * the next block that actually needs to be processed.
  *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.
+ * The block number and visibility status of the next block to process are set
+ * in blkno and all_visible_according_to_vm. heap_vac_scan_next_block()
+ * returns false if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here; vacuum options and information about the
  * relation are read, members of vacrel->next_block_state are read and set as
@@ -1112,19 +1100,14 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static void
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
+						 BlockNumber *blkno, bool *all_visible_according_to_vm)
 {
-	/* Use local variables for better optimized loop code */
-	BlockNumber rel_pages = vacrel->rel_pages;
 	/* Relies on InvalidBlockNumber + 1 == 0 */
 	BlockNumber next_block = vacrel->next_block_state.current_block + 1;
-	BlockNumber next_unskippable_block = next_block;
-
 	bool		skipsallvis = false;
 
-	vacrel->next_block_state.next_unskippable_allvis = true;
-
 	/*
 	 * A block is unskippable if it is not all visible according to the
 	 * visibility map. It is also unskippable if it is the last block in the
@@ -1144,88 +1127,107 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer)
 	 * indicates to the caller whether or not it is processing a skippable
 	 * (and thus all-visible) block.
 	 */
-	while (next_unskippable_block < rel_pages)
+	if (next_block >= vacrel->rel_pages)
 	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   vmbuffer);
+		vacrel->next_block_state.current_block = *blkno = InvalidBlockNumber;
+		return false;
+	}
+
+	if (vacrel->next_block_state.next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->next_block_state.next_unskippable_block)
+	{
+		/* Use local variables for better optimized loop code */
+		BlockNumber rel_pages = vacrel->rel_pages;
+		BlockNumber next_unskippable_block = vacrel->next_block_state.next_unskippable_block;
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		while (++next_unskippable_block < rel_pages)
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			vacrel->next_block_state.next_unskippable_allvis = false;
-			break;
-		}
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   vmbuffer);
 
-		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
-		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+			vacrel->next_block_state.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			vacrel->next_block_state.next_unskippable_allvis = false;
-			break;
-		}
+			if (!vacrel->next_block_state.next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages). This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
 				break;
 
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
+			{
+				/* Caller shouldn't rely on all_visible_according_to_vm */
+				vacrel->next_block_state.next_unskippable_allvis = false;
+				break;
+			}
+
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
-		}
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-	}
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
 
-	Assert(vacrel->next_block_state.next_unskippable_block >=
-		   vacrel->next_block_state.current_block);
-	vacrel->next_block_state.next_unskippable_block = next_unskippable_block;
+			vacuum_delay_point();
+		}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (vacrel->next_block_state.next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
-		vacrel->next_block_state.skipping_current_range = false;
-	else
-	{
-		vacrel->next_block_state.skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
+		vacrel->next_block_state.next_unskippable_block = next_unskippable_block;
+
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->next_block_state.next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->next_block_state.next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
 	}
 
-	if (next_unskippable_block >= rel_pages)
-		next_block = InvalidBlockNumber;
+	if (next_block == vacrel->next_block_state.next_unskippable_block)
+		*all_visible_according_to_vm = vacrel->next_block_state.next_unskippable_allvis;
+	else
+		*all_visible_according_to_vm = true;
 
-	vacrel->next_block_state.current_block = next_block;
+	vacrel->next_block_state.current_block = *blkno = next_block;
+	return true;
 }
 
 /*
@@ -1798,8 +1800,8 @@ lazy_scan_prune(LVRelState *vacrel,
 
 	/*
 	 * Handle setting visibility map bit based on information from the VM (as
-	 * of last lazy_scan_skip() call), and from all_visible and all_frozen
-	 * variables
+	 * of last heap_vac_scan_next_block() call), and from all_visible and
+	 * all_frozen variables
 	 */
 	if (!all_visible_according_to_vm && all_visible)
 	{
@@ -1834,8 +1836,8 @@ lazy_scan_prune(LVRelState *vacrel,
 	/*
 	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
 	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after lazy_scan_skip() was called, so we must recheck with
-	 * buffer lock before concluding that the VM is corrupt.
+	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
+	 * with buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
 			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
-- 
2.40.1

v7-0004-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch
From 01be526bfb450d795dca7cabe3cd97687ef60156 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v7 4/7] Remove unneeded vacuum_delay_point from
 heap_vac_scan_next_block

heap_vac_scan_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8d715caccc1..d2c8f27fc57 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1195,8 +1195,6 @@ heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
 				 */
 				skipsallvis = true;
 			}
-
-			vacuum_delay_point();
 		}
 
 		vacrel->next_block_state.next_unskippable_block = next_unskippable_block;
-- 
2.40.1

v7-0005-Streaming-Read-API.patch
From 4143bef6230138d85772f76a3129433f40d4195d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 6 Mar 2024 14:46:08 -0500
Subject: [PATCH v7 5/7] Streaming Read API

---
 src/backend/storage/Makefile             |   2 +-
 src/backend/storage/aio/Makefile         |  14 +
 src/backend/storage/aio/meson.build      |   5 +
 src/backend/storage/aio/streaming_read.c | 612 ++++++++++++++++++++++
 src/backend/storage/buffer/bufmgr.c      | 641 ++++++++++++++++-------
 src/backend/storage/buffer/localbuf.c    |  14 +-
 src/backend/storage/meson.build          |   1 +
 src/include/storage/bufmgr.h             |  45 ++
 src/include/storage/streaming_read.h     |  52 ++
 src/tools/pgindent/typedefs.list         |   3 +
 10 files changed, 1179 insertions(+), 210 deletions(-)
 create mode 100644 src/backend/storage/aio/Makefile
 create mode 100644 src/backend/storage/aio/meson.build
 create mode 100644 src/backend/storage/aio/streaming_read.c
 create mode 100644 src/include/storage/streaming_read.h

diff --git a/src/backend/storage/Makefile b/src/backend/storage/Makefile
index 8376cdfca20..eec03f6f2b4 100644
--- a/src/backend/storage/Makefile
+++ b/src/backend/storage/Makefile
@@ -8,6 +8,6 @@ subdir = src/backend/storage
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS     = buffer file freespace ipc large_object lmgr page smgr sync
+SUBDIRS     = aio buffer file freespace ipc large_object lmgr page smgr sync
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/Makefile b/src/backend/storage/aio/Makefile
new file mode 100644
index 00000000000..bcab44c802f
--- /dev/null
+++ b/src/backend/storage/aio/Makefile
@@ -0,0 +1,14 @@
+#
+# Makefile for storage/aio
+#
+# src/backend/storage/aio/Makefile
+#
+
+subdir = src/backend/storage/aio
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = \
+	streaming_read.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/meson.build b/src/backend/storage/aio/meson.build
new file mode 100644
index 00000000000..39aef2a84a2
--- /dev/null
+++ b/src/backend/storage/aio/meson.build
@@ -0,0 +1,5 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+backend_sources += files(
+  'streaming_read.c',
+)
diff --git a/src/backend/storage/aio/streaming_read.c b/src/backend/storage/aio/streaming_read.c
new file mode 100644
index 00000000000..71f2c4a70b6
--- /dev/null
+++ b/src/backend/storage/aio/streaming_read.c
@@ -0,0 +1,612 @@
+#include "postgres.h"
+
+#include "storage/streaming_read.h"
+#include "utils/rel.h"
+
+/*
+ * Element type for PgStreamingRead's circular array of block ranges.
+ */
+typedef struct PgStreamingReadRange
+{
+	bool		need_wait;
+	bool		advice_issued;
+	BlockNumber blocknum;
+	int			nblocks;
+	int			per_buffer_data_index;
+	Buffer		buffers[MAX_BUFFERS_PER_TRANSFER];
+	ReadBuffersOperation operation;
+} PgStreamingReadRange;
+
+/*
+ * Streaming read object.
+ */
+struct PgStreamingRead
+{
+	int			max_ios;
+	int			ios_in_progress;
+	int			max_pinned_buffers;
+	int			pinned_buffers;
+	int			pinned_buffers_trigger;
+	int			next_tail_buffer;
+	int			ramp_up_pin_limit;
+	int			ramp_up_pin_stall;
+	bool		finished;
+	bool		advice_enabled;
+	void	   *pgsr_private;
+	PgStreamingReadBufferCB callback;
+
+	BufferAccessStrategy strategy;
+	BufferManagerRelation bmr;
+	ForkNumber	forknum;
+
+	/* Sometimes we need to buffer one block for flow control. */
+	BlockNumber unget_blocknum;
+	void	   *unget_per_buffer_data;
+
+	/* Next expected block, for detecting sequential access. */
+	BlockNumber seq_blocknum;
+
+	/* Space for optional per-buffer private data. */
+	size_t		per_buffer_data_size;
+	void	   *per_buffer_data;
+
+	/* Circular buffer of ranges. */
+	int			size;
+	int			head;
+	int			tail;
+	PgStreamingReadRange ranges[FLEXIBLE_ARRAY_MEMBER];
+};
+
+static PgStreamingRead *
+pg_streaming_read_buffer_alloc_internal(int flags,
+										void *pgsr_private,
+										size_t per_buffer_data_size,
+										BufferAccessStrategy strategy)
+{
+	PgStreamingRead *pgsr;
+	int			size;
+	int			max_ios;
+	uint32		max_pinned_buffers;
+
+
+	/*
+	 * Decide how many assumed I/Os we will allow to run concurrently.  That
+	 * is, advice to the kernel to tell it that we will soon read.  This
+	 * number also affects how far we look ahead for opportunities to start
+	 * more I/Os.
+	 */
+	if (flags & PGSR_FLAG_MAINTENANCE)
+		max_ios = maintenance_io_concurrency;
+	else
+		max_ios = effective_io_concurrency;
+
+	/*
+	 * The desired level of I/O concurrency controls how far ahead we are
+	 * willing to look.  We also clamp it to at least
+	 * MAX_BUFFERS_PER_TRANSFER so that we can have a chance to build up a
+	 * full sized read, even when max_ios is zero.
+	 */
+	max_pinned_buffers = Max(max_ios * 4, MAX_BUFFERS_PER_TRANSFER);
+
+	/*
+	 * The *_io_concurrency GUCs might be set to 0, but we want to allow at
+	 * least one, to keep our gating logic simple.
+	 */
+	max_ios = Max(max_ios, 1);
+
+	/*
+	 * Don't allow this backend to pin too many buffers.  For now we'll apply
+	 * the limit for the shared buffer pool and the local buffer pool, without
+	 * worrying which it is.
+	 */
+	LimitAdditionalPins(&max_pinned_buffers);
+	LimitAdditionalLocalPins(&max_pinned_buffers);
+	Assert(max_pinned_buffers > 0);
+
+	/*
+	 * pgsr->ranges is a circular buffer.  When it is empty, head == tail.
+	 * When it is full, there is an empty element between head and tail.  Head
+	 * can also be empty (nblocks == 0), therefore we need two extra elements
+	 * for non-occupied ranges, on top of max_pinned_buffers to allow for the
+	 * maximum possible number of occupied ranges of the smallest possible
+	 * size of one.
+	 */
+	size = max_pinned_buffers + 2;
+
+	pgsr = (PgStreamingRead *)
+		palloc0(offsetof(PgStreamingRead, ranges) +
+				sizeof(pgsr->ranges[0]) * size);
+
+	pgsr->max_ios = max_ios;
+	pgsr->per_buffer_data_size = per_buffer_data_size;
+	pgsr->max_pinned_buffers = max_pinned_buffers;
+	pgsr->pgsr_private = pgsr_private;
+	pgsr->strategy = strategy;
+	pgsr->size = size;
+
+	pgsr->unget_blocknum = InvalidBlockNumber;
+
+#ifdef USE_PREFETCH
+
+	/*
+	 * This system supports prefetching advice.  As long as direct I/O isn't
+	 * enabled, and the caller hasn't promised sequential access, we can use
+	 * it.
+	 */
+	if ((io_direct_flags & IO_DIRECT_DATA) == 0 &&
+		(flags & PGSR_FLAG_SEQUENTIAL) == 0)
+		pgsr->advice_enabled = true;
+#endif
+
+	/*
+	 * We start off building small ranges, but double that quickly, for the
+	 * benefit of users that don't know how far ahead they'll read.  This can
+	 * be disabled by users that already know they'll read all the way.
+	 */
+	if (flags & PGSR_FLAG_FULL)
+		pgsr->ramp_up_pin_limit = INT_MAX;
+	else
+		pgsr->ramp_up_pin_limit = 1;
+
+	/*
+	 * We want to avoid creating ranges that are smaller than they could be
+	 * just because we hit max_pinned_buffers.  We only look ahead when the
+	 * number of pinned buffers falls below this trigger number, or put
+	 * another way, we stop looking ahead when we wouldn't be able to build a
+	 * "full sized" range.
+	 */
+	pgsr->pinned_buffers_trigger =
+		Max(1, (int) max_pinned_buffers - MAX_BUFFERS_PER_TRANSFER);
+
+	/* Space for the callback to store extra data along with each block. */
+	if (per_buffer_data_size)
+		pgsr->per_buffer_data = palloc(per_buffer_data_size * max_pinned_buffers);
+
+	return pgsr;
+}
+
+/*
+ * Create a new streaming read object that can be used to perform the
+ * equivalent of a series of ReadBuffer() calls for one fork of one relation.
+ * Internally, it generates larger vectored reads where possible by looking
+ * ahead.
+ */
+PgStreamingRead *
+pg_streaming_read_buffer_alloc(int flags,
+							   void *pgsr_private,
+							   size_t per_buffer_data_size,
+							   BufferAccessStrategy strategy,
+							   BufferManagerRelation bmr,
+							   ForkNumber forknum,
+							   PgStreamingReadBufferCB next_block_cb)
+{
+	PgStreamingRead *result;
+
+	result = pg_streaming_read_buffer_alloc_internal(flags,
+													 pgsr_private,
+													 per_buffer_data_size,
+													 strategy);
+	result->callback = next_block_cb;
+	result->bmr = bmr;
+	result->forknum = forknum;
+
+	return result;
+}
+
+/*
+ * Find the per-buffer data index for the Nth block of a range.
+ */
+static int
+get_per_buffer_data_index(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	int			result;
+
+	/*
+	 * Find slot in the circular buffer of per-buffer data, without using the
+	 * expensive % operator.
+	 */
+	result = range->per_buffer_data_index + n;
+	if (result >= pgsr->max_pinned_buffers)
+		result -= pgsr->max_pinned_buffers;
+	Assert(result == (range->per_buffer_data_index + n) % pgsr->max_pinned_buffers);
+
+	return result;
+}
+
+/*
+ * Return a pointer to the per-buffer data by index.
+ */
+static void *
+get_per_buffer_data_by_index(PgStreamingRead *pgsr, int per_buffer_data_index)
+{
+	return (char *) pgsr->per_buffer_data +
+		pgsr->per_buffer_data_size * per_buffer_data_index;
+}
+
+/*
+ * Return a pointer to the per-buffer data for the Nth block of a range.
+ */
+static void *
+get_per_buffer_data(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	return get_per_buffer_data_by_index(pgsr,
+										get_per_buffer_data_index(pgsr,
+																  range,
+																  n));
+}
+
+/*
+ * Start reading the head range, and create a new head range.  The new head
+ * range is returned.  It may not be empty, if StartReadBuffers() couldn't
+ * start the entire range; in that case the returned range contains the
+ * remaining portion of the range.
+ */
+static PgStreamingReadRange *
+pg_streaming_read_start_head_range(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *head_range;
+	PgStreamingReadRange *new_head_range;
+	int			nblocks_pinned;
+	int			flags;
+
+	/* Caller should make sure we never exceed max_ios. */
+	Assert(pgsr->ios_in_progress < pgsr->max_ios);
+
+	/* Should only call if the head range has some blocks to read. */
+	head_range = &pgsr->ranges[pgsr->head];
+	Assert(head_range->nblocks > 0);
+
+	/*
+	 * If advice hasn't been suppressed, this system supports it, and this
+	 * isn't a strictly sequential pattern, then we'll issue advice.
+	 */
+	if (pgsr->advice_enabled && head_range->blocknum != pgsr->seq_blocknum)
+		flags = READ_BUFFERS_ISSUE_ADVICE;
+	else
+		flags = 0;
+
+
+	/* Start reading as many blocks as we can from the head range. */
+	nblocks_pinned = head_range->nblocks;
+	head_range->need_wait =
+		StartReadBuffers(pgsr->bmr,
+						 head_range->buffers,
+						 pgsr->forknum,
+						 head_range->blocknum,
+						 &nblocks_pinned,
+						 pgsr->strategy,
+						 flags,
+						 &head_range->operation);
+
+	/* Did that start an I/O? */
+	if (head_range->need_wait && (flags & READ_BUFFERS_ISSUE_ADVICE))
+	{
+		head_range->advice_issued = true;
+		pgsr->ios_in_progress++;
+		Assert(pgsr->ios_in_progress <= pgsr->max_ios);
+	}
+
+	/*
+	 * StartReadBuffers() might have pinned fewer blocks than we asked it to,
+	 * but always at least one.
+	 */
+	Assert(nblocks_pinned <= head_range->nblocks);
+	Assert(nblocks_pinned >= 1);
+	pgsr->pinned_buffers += nblocks_pinned;
+
+	/*
+	 * Remember where the next block would be after that, so we can detect
+	 * sequential access next time.
+	 */
+	pgsr->seq_blocknum = head_range->blocknum + nblocks_pinned;
+
+	/*
+	 * Create a new head range.  There must be space, because we have enough
+	 * elements for every range to hold just one block, up to the pin limit.
+	 */
+	Assert(pgsr->size > pgsr->max_pinned_buffers);
+	Assert((pgsr->head + 1) % pgsr->size != pgsr->tail);
+	if (++pgsr->head == pgsr->size)
+		pgsr->head = 0;
+	new_head_range = &pgsr->ranges[pgsr->head];
+	new_head_range->nblocks = 0;
+	new_head_range->advice_issued = false;
+
+	/*
+	 * If we didn't manage to start the whole read above, we split the range,
+	 * moving the remainder into the new head range.
+	 */
+	if (nblocks_pinned < head_range->nblocks)
+	{
+		int			nblocks_remaining = head_range->nblocks - nblocks_pinned;
+
+		head_range->nblocks = nblocks_pinned;
+
+		new_head_range->blocknum = head_range->blocknum + nblocks_pinned;
+		new_head_range->nblocks = nblocks_remaining;
+	}
+
+	/* The new range has per-buffer data starting after the previous range. */
+	new_head_range->per_buffer_data_index =
+		get_per_buffer_data_index(pgsr, head_range, nblocks_pinned);
+
+	return new_head_range;
+}
+
+/*
+ * Ask the callback which block it would like us to read next, with a small
+ * buffer in front to allow pg_streaming_unget_block() to work.
+ */
+static BlockNumber
+pg_streaming_get_block(PgStreamingRead *pgsr, void *per_buffer_data)
+{
+	BlockNumber result;
+
+	if (unlikely(pgsr->unget_blocknum != InvalidBlockNumber))
+	{
+		/*
+		 * If we had to unget a block, now it is time to return that one
+		 * again.
+		 */
+		result = pgsr->unget_blocknum;
+		pgsr->unget_blocknum = InvalidBlockNumber;
+
+		/*
+		 * The same per_buffer_data element must have been used, and still
+		 * contains whatever data the callback wrote into it.  So we just
+		 * sanity-check that we were called with the value that
+		 * pg_streaming_unget_block() pushed back.
+		 */
+		Assert(per_buffer_data == pgsr->unget_per_buffer_data);
+	}
+	else
+	{
+		/* Use the installed callback directly. */
+		result = pgsr->callback(pgsr, pgsr->pgsr_private, per_buffer_data);
+	}
+
+	return result;
+}
+
+/*
+ * In order to deal with short reads in StartReadBuffers(), we sometimes need
+ * to defer handling of a block until later.  This *must* be called with the
+ * last value returned by pg_streaming_get_block().
+ */
+static void
+pg_streaming_unget_block(PgStreamingRead *pgsr, BlockNumber blocknum, void *per_buffer_data)
+{
+	Assert(pgsr->unget_blocknum == InvalidBlockNumber);
+	pgsr->unget_blocknum = blocknum;
+	pgsr->unget_per_buffer_data = per_buffer_data;
+}
+
+static void
+pg_streaming_read_look_ahead(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *range;
+
+	/*
+	 * If we're still ramping up, we may have to stall and wait for buffers
+	 * to be consumed before we do any more prefetching.
+	 */
+	if (pgsr->ramp_up_pin_stall > 0)
+	{
+		Assert(pgsr->pinned_buffers > 0);
+		return;
+	}
+
+	/*
+	 * If we're finished or can't start more I/O, then don't look ahead.
+	 */
+	if (pgsr->finished || pgsr->ios_in_progress == pgsr->max_ios)
+		return;
+
+	/*
+	 * We'll also wait until the number of pinned buffers falls below our
+	 * trigger level, so that we have the chance to create a full range.
+	 */
+	if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+		return;
+
+	do
+	{
+		BlockNumber blocknum;
+		void	   *per_buffer_data;
+
+		/* Do we have a full-sized range? */
+		range = &pgsr->ranges[pgsr->head];
+		if (range->nblocks == lengthof(range->buffers))
+		{
+			/* Start as much of it as we can. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/* If we're now at the I/O limit, stop here. */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+				return;
+
+			/*
+			 * If we couldn't form a full range, then stop here to avoid
+			 * creating small I/O.
+			 */
+			if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+				return;
+
+			/*
+			 * The range might have been only partially started, but it always
+			 * processes at least one block, so that will do for now.
+			 */
+			Assert(range->nblocks < lengthof(range->buffers));
+		}
+
+		/* Find per-buffer data slot for the next block. */
+		per_buffer_data = get_per_buffer_data(pgsr, range, range->nblocks);
+
+		/* Find out which block the callback wants to read next. */
+		blocknum = pg_streaming_get_block(pgsr, per_buffer_data);
+		if (blocknum == InvalidBlockNumber)
+		{
+			/* End of stream. */
+			pgsr->finished = true;
+			break;
+		}
+
+		/*
+		 * Is there a head range that we cannot extend, because the requested
+		 * block is not consecutive?
+		 */
+		if (range->nblocks > 0 &&
+			range->blocknum + range->nblocks != blocknum)
+		{
+			/* Yes.  Start it, so we can begin building a new one. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * It's possible that it was only partially started, and we have a
+			 * new range with the remainder.  Keep starting I/Os until we get
+			 * it all out of the way, or we hit the I/O limit.
+			 */
+			while (range->nblocks > 0 && pgsr->ios_in_progress < pgsr->max_ios)
+				range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * We have to 'unget' the block returned by the callback if we
+			 * don't have enough I/O capacity left to start something.
+			 */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+			{
+				pg_streaming_unget_block(pgsr, blocknum, per_buffer_data);
+				return;
+			}
+		}
+
+		/* If we have a new, empty range, initialize the start block. */
+		if (range->nblocks == 0)
+		{
+			range->blocknum = blocknum;
+		}
+
+		/* This block extends the range by one. */
+		Assert(range->blocknum + range->nblocks == blocknum);
+		range->nblocks++;
+
+	} while (pgsr->pinned_buffers + range->nblocks < pgsr->max_pinned_buffers &&
+			 pgsr->pinned_buffers + range->nblocks < pgsr->ramp_up_pin_limit);
+
+	/* If we've hit the ramp-up limit, insert a stall. */
+	if (pgsr->pinned_buffers + range->nblocks >= pgsr->ramp_up_pin_limit)
+	{
+		/* Can't get here if an earlier stall hasn't finished. */
+		Assert(pgsr->ramp_up_pin_stall == 0);
+		/* Don't do any more prefetching until these buffers are consumed. */
+		pgsr->ramp_up_pin_stall = pgsr->ramp_up_pin_limit;
+		/* Double it.  It will soon be out of the way. */
+		pgsr->ramp_up_pin_limit *= 2;
+	}
+
+	/* Start as much as we can. */
+	while (range->nblocks > 0)
+	{
+		range = pg_streaming_read_start_head_range(pgsr);
+		if (pgsr->ios_in_progress == pgsr->max_ios)
+			break;
+	}
+}
+
+Buffer
+pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)
+{
+	pg_streaming_read_look_ahead(pgsr);
+
+	/* See if we have one buffer to return. */
+	while (pgsr->tail != pgsr->head)
+	{
+		PgStreamingReadRange *tail_range;
+
+		tail_range = &pgsr->ranges[pgsr->tail];
+
+		/*
+		 * Do we need to perform an I/O before returning the buffers from this
+		 * range?
+		 */
+		if (tail_range->need_wait)
+		{
+			WaitReadBuffers(&tail_range->operation);
+			tail_range->need_wait = false;
+
+			/*
+			 * We don't really know if the kernel generated a physical I/O
+			 * when we issued advice, let alone when it finished, but it has
+			 * certainly finished now because we've performed the read.
+			 */
+			if (tail_range->advice_issued)
+			{
+				Assert(pgsr->ios_in_progress > 0);
+				pgsr->ios_in_progress--;
+			}
+		}
+
+		/* Are there more buffers available in this range? */
+		if (pgsr->next_tail_buffer < tail_range->nblocks)
+		{
+			int			buffer_index;
+			Buffer		buffer;
+
+			buffer_index = pgsr->next_tail_buffer++;
+			buffer = tail_range->buffers[buffer_index];
+
+			Assert(BufferIsValid(buffer));
+
+			/* We are giving away ownership of this pinned buffer. */
+			Assert(pgsr->pinned_buffers > 0);
+			pgsr->pinned_buffers--;
+
+			if (pgsr->ramp_up_pin_stall > 0)
+				pgsr->ramp_up_pin_stall--;
+
+			if (per_buffer_data)
+				*per_buffer_data = get_per_buffer_data(pgsr, tail_range, buffer_index);
+
+			return buffer;
+		}
+
+		/* Advance tail to next range, if there is one. */
+		if (++pgsr->tail == pgsr->size)
+			pgsr->tail = 0;
+		pgsr->next_tail_buffer = 0;
+
+		/*
+		 * If tail crashed into head, and head is not empty, then it is time
+		 * to start that range.
+		 */
+		if (pgsr->tail == pgsr->head &&
+			pgsr->ranges[pgsr->head].nblocks > 0)
+			pg_streaming_read_start_head_range(pgsr);
+	}
+
+	Assert(pgsr->pinned_buffers == 0);
+
+	return InvalidBuffer;
+}
+
+void
+pg_streaming_read_free(PgStreamingRead *pgsr)
+{
+	Buffer		buffer;
+
+	/* Stop looking ahead. */
+	pgsr->finished = true;
+
+	/* Unpin anything that wasn't consumed. */
+	while ((buffer = pg_streaming_read_buffer_get_next(pgsr, NULL)) != InvalidBuffer)
+		ReleaseBuffer(buffer);
+
+	Assert(pgsr->pinned_buffers == 0);
+	Assert(pgsr->ios_in_progress == 0);
+
+	/* Release memory. */
+	if (pgsr->per_buffer_data)
+		pfree(pgsr->per_buffer_data);
+
+	pfree(pgsr);
+}
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f0f8d4259c5..729d1f91721 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -19,6 +19,11 @@
  *		and pin it so that no one can destroy it while this process
  *		is using it.
  *
+ * StartReadBuffers() -- as above, but for multiple contiguous blocks in
+ *		two steps.
+ *
+ * WaitReadBuffers() -- second step of StartReadBuffers().
+ *
  * ReleaseBuffer() -- unpin a buffer
  *
  * MarkBufferDirty() -- mark a pinned buffer's contents as "dirty".
@@ -471,10 +476,9 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
 )
 
 
-static Buffer ReadBuffer_common(SMgrRelation smgr, char relpersistence,
+static Buffer ReadBuffer_common(BufferManagerRelation bmr,
 								ForkNumber forkNum, BlockNumber blockNum,
-								ReadBufferMode mode, BufferAccessStrategy strategy,
-								bool *hit);
+								ReadBufferMode mode, BufferAccessStrategy strategy);
 static BlockNumber ExtendBufferedRelCommon(BufferManagerRelation bmr,
 										   ForkNumber fork,
 										   BufferAccessStrategy strategy,
@@ -500,7 +504,7 @@ static uint32 WaitBufHdrUnlocked(BufferDesc *buf);
 static int	SyncOneBuffer(int buf_id, bool skip_recently_used,
 						  WritebackContext *wb_context);
 static void WaitIO(BufferDesc *buf);
-static bool StartBufferIO(BufferDesc *buf, bool forInput);
+static bool StartBufferIO(BufferDesc *buf, bool forInput, bool nowait);
 static void TerminateBufferIO(BufferDesc *buf, bool clear_dirty,
 							  uint32 set_flag_bits, bool forget_owner);
 static void AbortBufferIO(Buffer buffer);
@@ -781,7 +785,6 @@ Buffer
 ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				   ReadBufferMode mode, BufferAccessStrategy strategy)
 {
-	bool		hit;
 	Buffer		buf;
 
 	/*
@@ -794,15 +797,9 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary tables of other sessions")));
 
-	/*
-	 * Read the buffer, and update pgstat counters to reflect a cache hit or
-	 * miss.
-	 */
-	pgstat_count_buffer_read(reln);
-	buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
-							forkNum, blockNum, mode, strategy, &hit);
-	if (hit)
-		pgstat_count_buffer_hit(reln);
+	buf = ReadBuffer_common(BMR_REL(reln),
+							forkNum, blockNum, mode, strategy);
+
 	return buf;
 }
 
@@ -822,13 +819,12 @@ ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
 						  BlockNumber blockNum, ReadBufferMode mode,
 						  BufferAccessStrategy strategy, bool permanent)
 {
-	bool		hit;
-
 	SMgrRelation smgr = smgropen(rlocator, INVALID_PROC_NUMBER);
 
-	return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
-							 RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
-							 mode, strategy, &hit);
+	return ReadBuffer_common(BMR_SMGR(smgr, permanent ? RELPERSISTENCE_PERMANENT :
+									  RELPERSISTENCE_UNLOGGED),
+							 forkNum, blockNum,
+							 mode, strategy);
 }
 
 /*
@@ -994,35 +990,68 @@ ExtendBufferedRelTo(BufferManagerRelation bmr,
 	 */
 	if (buffer == InvalidBuffer)
 	{
-		bool		hit;
-
 		Assert(extended_by == 0);
-		buffer = ReadBuffer_common(bmr.smgr, bmr.relpersistence,
-								   fork, extend_to - 1, mode, strategy,
-								   &hit);
+		buffer = ReadBuffer_common(bmr, fork, extend_to - 1, mode, strategy);
 	}
 
 	return buffer;
 }
 
+/*
+ * Zero a buffer and lock it, as part of the implementation of
+ * RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK.  The buffer must already
+ * be pinned.  It does not have to be valid, but it is valid and locked on
+ * return.
+ */
+static void
+ZeroBuffer(Buffer buffer, ReadBufferMode mode)
+{
+	BufferDesc *bufHdr;
+	uint32		buf_state;
+
+	Assert(mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK);
+
+	if (BufferIsLocal(buffer))
+		bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+	else
+	{
+		bufHdr = GetBufferDescriptor(buffer - 1);
+		if (mode == RBM_ZERO_AND_LOCK)
+			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+		else
+			LockBufferForCleanup(buffer);
+	}
+
+	memset(BufferGetPage(buffer), 0, BLCKSZ);
+
+	if (BufferIsLocal(buffer))
+	{
+		buf_state = pg_atomic_read_u32(&bufHdr->state);
+		buf_state |= BM_VALID;
+		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+	}
+	else
+	{
+		buf_state = LockBufHdr(bufHdr);
+		buf_state |= BM_VALID;
+		UnlockBufHdr(bufHdr, buf_state);
+	}
+}
+
 /*
  * ReadBuffer_common -- common logic for all ReadBuffer variants
  *
  * *hit is set to true if the request was satisfied from shared buffer cache.
  */
 static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(BufferManagerRelation bmr, ForkNumber forkNum,
 				  BlockNumber blockNum, ReadBufferMode mode,
-				  BufferAccessStrategy strategy, bool *hit)
+				  BufferAccessStrategy strategy)
 {
-	BufferDesc *bufHdr;
-	Block		bufBlock;
-	bool		found;
-	IOContext	io_context;
-	IOObject	io_object;
-	bool		isLocalBuf = SmgrIsTemp(smgr);
-
-	*hit = false;
+	ReadBuffersOperation operation;
+	Buffer		buffer;
+	int			nblocks;
+	int			flags;
 
 	/*
 	 * Backward compatibility path, most code should use ExtendBufferedRel()
@@ -1041,181 +1070,404 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
 			flags |= EB_LOCK_FIRST;
 
-		return ExtendBufferedRel(BMR_SMGR(smgr, relpersistence),
-								 forkNum, strategy, flags);
+		return ExtendBufferedRel(bmr, forkNum, strategy, flags);
 	}
 
-	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
-									   smgr->smgr_rlocator.locator.spcOid,
-									   smgr->smgr_rlocator.locator.dbOid,
-									   smgr->smgr_rlocator.locator.relNumber,
-									   smgr->smgr_rlocator.backend);
+	nblocks = 1;
+	if (mode == RBM_ZERO_ON_ERROR)
+		flags = READ_BUFFERS_ZERO_ON_ERROR;
+	else
+		flags = 0;
+	if (StartReadBuffers(bmr,
+						 &buffer,
+						 forkNum,
+						 blockNum,
+						 &nblocks,
+						 strategy,
+						 flags,
+						 &operation))
+		WaitReadBuffers(&operation);
+	Assert(nblocks == 1);		/* single block can't be short */
+
+	if (mode == RBM_ZERO_AND_CLEANUP_LOCK || mode == RBM_ZERO_AND_LOCK)
+		ZeroBuffer(buffer, mode);
+
+	return buffer;
+}
+
+static Buffer
+PrepareReadBuffer(BufferManagerRelation bmr,
+				  ForkNumber forkNum,
+				  BlockNumber blockNum,
+				  BufferAccessStrategy strategy,
+				  bool *foundPtr)
+{
+	BufferDesc *bufHdr;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	Assert(blockNum != P_NEW);
 
+	Assert(bmr.smgr);
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/*
-		 * We do not use a BufferAccessStrategy for I/O of temporary tables.
-		 * However, in some cases, the "strategy" may not be NULL, so we can't
-		 * rely on IOContextForStrategy() to set the right IOContext for us.
-		 * This may happen in cases like CREATE TEMPORARY TABLE AS...
-		 */
 		io_context = IOCONTEXT_NORMAL;
 		io_object = IOOBJECT_TEMP_RELATION;
-		bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
-		if (found)
-			pgBufferUsage.local_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.local_blks_read++;
 	}
 	else
 	{
-		/*
-		 * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
-		 * not currently in memory.
-		 */
 		io_context = IOContextForStrategy(strategy);
 		io_object = IOOBJECT_RELATION;
-		bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
-							 strategy, &found, io_context);
-		if (found)
-			pgBufferUsage.shared_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.shared_blks_read++;
 	}
 
-	/* At this point we do NOT hold any locks. */
+	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
+									   bmr.smgr->smgr_rlocator.locator.spcOid,
+									   bmr.smgr->smgr_rlocator.locator.dbOid,
+									   bmr.smgr->smgr_rlocator.locator.relNumber,
+									   bmr.smgr->smgr_rlocator.backend);
 
-	/* if it was already in the buffer pool, we're done */
-	if (found)
+	ResourceOwnerEnlarge(CurrentResourceOwner);
+	if (isLocalBuf)
+	{
+		bufHdr = LocalBufferAlloc(bmr.smgr, forkNum, blockNum, foundPtr);
+		if (*foundPtr)
+			pgBufferUsage.local_blks_hit++;
+	}
+	else
+	{
+		bufHdr = BufferAlloc(bmr.smgr, bmr.relpersistence, forkNum, blockNum,
+							 strategy, foundPtr, io_context);
+		if (*foundPtr)
+			pgBufferUsage.shared_blks_hit++;
+	}
+	if (bmr.rel)
+	{
+		/*
+		 * While pgBufferUsage's "read" counter isn't bumped unless we reach
+		 * WaitReadBuffers() (so, not for hits, and not for buffers that are
+		 * zeroed instead), the per-relation stats always count them.
+		 */
+		pgstat_count_buffer_read(bmr.rel);
+		if (*foundPtr)
+			pgstat_count_buffer_hit(bmr.rel);
+	}
+	if (*foundPtr)
 	{
-		/* Just need to update stats before we exit */
-		*hit = true;
 		VacuumPageHit++;
 		pgstat_count_io_op(io_object, io_context, IOOP_HIT);
-
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageHit;
 
 		TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-										  smgr->smgr_rlocator.locator.spcOid,
-										  smgr->smgr_rlocator.locator.dbOid,
-										  smgr->smgr_rlocator.locator.relNumber,
-										  smgr->smgr_rlocator.backend,
-										  found);
+										  bmr.smgr->smgr_rlocator.locator.spcOid,
+										  bmr.smgr->smgr_rlocator.locator.dbOid,
+										  bmr.smgr->smgr_rlocator.locator.relNumber,
+										  bmr.smgr->smgr_rlocator.backend,
+										  true);
+	}
 
-		/*
-		 * In RBM_ZERO_AND_LOCK mode the caller expects the page to be locked
-		 * on return.
-		 */
-		if (!isLocalBuf)
-		{
-			if (mode == RBM_ZERO_AND_LOCK)
-				LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
-							  LW_EXCLUSIVE);
-			else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
-				LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
-		}
+	return BufferDescriptorGetBuffer(bufHdr);
+}
 
-		return BufferDescriptorGetBuffer(bufHdr);
+/*
+ * Begin reading a range of blocks beginning at blockNum and extending for
+ * *nblocks.  On return, up to *nblocks pinned buffers holding those blocks
+ * are written into the buffers array, and *nblocks is updated to contain the
+ * actual number, which may be fewer than requested.
+ *
+ * If false is returned, no I/O is necessary and WaitReadBuffers() need not
+ * be called.  If true is returned, one I/O has been started, and
+ * WaitReadBuffers() must be called with the same operation object before the
+ * buffers are accessed.  Along with the operation object, the caller-supplied
+ * array of buffers must remain valid until WaitReadBuffers() is called.
+ *
+ * Currently the I/O is only started with optional operating system advice,
+ * and the real I/O happens in WaitReadBuffers().  In future work, true I/O
+ * could be initiated here.
+ */
+bool
+StartReadBuffers(BufferManagerRelation bmr,
+				 Buffer *buffers,
+				 ForkNumber forkNum,
+				 BlockNumber blockNum,
+				 int *nblocks,
+				 BufferAccessStrategy strategy,
+				 int flags,
+				 ReadBuffersOperation *operation)
+{
+	int			actual_nblocks = *nblocks;
+
+	if (bmr.rel)
+	{
+		bmr.smgr = RelationGetSmgr(bmr.rel);
+		bmr.relpersistence = bmr.rel->rd_rel->relpersistence;
 	}
 
-	/*
-	 * if we have gotten to this point, we have allocated a buffer for the
-	 * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
-	 * if it's a shared buffer.
-	 */
-	Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));	/* spinlock not needed */
+	operation->bmr = bmr;
+	operation->forknum = forkNum;
+	operation->blocknum = blockNum;
+	operation->buffers = buffers;
+	operation->nblocks = actual_nblocks;
+	operation->strategy = strategy;
+	operation->flags = flags;
 
-	bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
+	operation->io_buffers_len = 0;
 
-	/*
-	 * Read in the page, unless the caller intends to overwrite it and just
-	 * wants us to allocate a buffer.
-	 */
-	if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
-		MemSet((char *) bufBlock, 0, BLCKSZ);
-	else
+	for (int i = 0; i < actual_nblocks; ++i)
 	{
-		instr_time	io_start = pgstat_prepare_io_time(track_io_timing);
+		bool		found;
 
-		smgrread(smgr, forkNum, blockNum, bufBlock);
+		buffers[i] = PrepareReadBuffer(bmr,
+									   forkNum,
+									   blockNum + i,
+									   strategy,
+									   &found);
 
-		pgstat_count_io_op_time(io_object, io_context,
-								IOOP_READ, io_start, 1);
+		if (found)
+		{
+			/*
+			 * Terminate the read as soon as we get a hit.  It could be a
+			 * single buffer hit, or it could be a hit that follows a readable
+			 * range.  We don't want to create more than one readable range,
+			 * so we stop here.
+			 */
+			actual_nblocks = operation->nblocks = *nblocks = i + 1;
+		}
+		else
+		{
+			/* Extend the readable range to cover this block. */
+			operation->io_buffers_len++;
+		}
+	}
 
-		/* check for garbage data */
-		if (!PageIsVerifiedExtended((Page) bufBlock, blockNum,
-									PIV_LOG_WARNING | PIV_REPORT_STAT))
+	if (operation->io_buffers_len > 0)
+	{
+		if (flags & READ_BUFFERS_ISSUE_ADVICE)
 		{
-			if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
-			{
-				ereport(WARNING,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s; zeroing out page",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
-				MemSet((char *) bufBlock, 0, BLCKSZ);
-			}
-			else
-				ereport(ERROR,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
+			/*
+			 * In theory we should only do this if PrepareReadBuffer() had to
+			 * allocate new buffers above.  That way, if two calls to
+			 * StartReadBuffers() were made for the same blocks before
+			 * WaitReadBuffers(), only the first would issue the advice.
+			 * That'd be a better simulation of true asynchronous I/O, which
+			 * would only start the I/O once, but isn't done here for
+			 * simplicity.  Note also that the following call might actually
+			 * issue two advice calls if we cross a segment boundary; in a
+			 * true asynchronous version we might choose to process only one
+			 * real I/O at a time in that case.
+			 */
+			smgrprefetch(bmr.smgr, forkNum, blockNum, operation->io_buffers_len);
 		}
+
+		/* Indicate that WaitReadBuffers() should be called. */
+		return true;
 	}
+	else
+	{
+		return false;
+	}
+}
 
-	/*
-	 * In RBM_ZERO_AND_LOCK / RBM_ZERO_AND_CLEANUP_LOCK mode, grab the buffer
-	 * content lock before marking the page as valid, to make sure that no
-	 * other backend sees the zeroed page before the caller has had a chance
-	 * to initialize it.
-	 *
-	 * Since no-one else can be looking at the page contents yet, there is no
-	 * difference between an exclusive lock and a cleanup-strength lock. (Note
-	 * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
-	 * they assert that the buffer is already valid.)
-	 */
-	if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
-		!isLocalBuf)
+static inline bool
+WaitReadBuffersCanStartIO(Buffer buffer, bool nowait)
+{
+	if (BufferIsLocal(buffer))
 	{
-		LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
+		BufferDesc *bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+
+		return (pg_atomic_read_u32(&bufHdr->state) & BM_VALID) == 0;
 	}
+	else
+		return StartBufferIO(GetBufferDescriptor(buffer - 1), true, nowait);
+}
+
+void
+WaitReadBuffers(ReadBuffersOperation *operation)
+{
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	int			nblocks;
+	BlockNumber blocknum;
+	ForkNumber	forknum;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	/*
+	 * Currently operations are only allowed to include a read of some range,
+	 * with an optional extra buffer that is already pinned at the end.  So
+	 * nblocks can be at most one more than io_buffers_len.
+	 */
+	Assert((operation->nblocks == operation->io_buffers_len) ||
+		   (operation->nblocks == operation->io_buffers_len + 1));
 
+	/* Find the range of the physical read we need to perform. */
+	nblocks = operation->io_buffers_len;
+	if (nblocks == 0)
+		return;					/* nothing to do */
+
+	buffers = &operation->buffers[0];
+	blocknum = operation->blocknum;
+	forknum = operation->forknum;
+	bmr = operation->bmr;
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/* Only need to adjust flags */
-		uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
-
-		buf_state |= BM_VALID;
-		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+		io_context = IOCONTEXT_NORMAL;
+		io_object = IOOBJECT_TEMP_RELATION;
 	}
 	else
 	{
-		/* Set BM_VALID, terminate IO, and wake up any waiters */
-		TerminateBufferIO(bufHdr, false, BM_VALID, true);
+		io_context = IOContextForStrategy(operation->strategy);
+		io_object = IOOBJECT_RELATION;
 	}
 
-	VacuumPageMiss++;
-	if (VacuumCostActive)
-		VacuumCostBalance += VacuumCostPageMiss;
+	/*
+	 * We count all these blocks as read by this backend.  This is traditional
+	 * behavior, but might turn out to be not true if we find that someone
+	 * else has beaten us and completed the read of some of these blocks.  In
+	 * that case the system globally double-counts, but we traditionally don't
+	 * count this as a "hit", and we don't have a separate counter for "miss,
+	 * but another backend completed the read".
+	 */
+	if (isLocalBuf)
+		pgBufferUsage.local_blks_read += nblocks;
+	else
+		pgBufferUsage.shared_blks_read += nblocks;
 
-	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-									  smgr->smgr_rlocator.locator.spcOid,
-									  smgr->smgr_rlocator.locator.dbOid,
-									  smgr->smgr_rlocator.locator.relNumber,
-									  smgr->smgr_rlocator.backend,
-									  found);
+	for (int i = 0; i < nblocks; ++i)
+	{
+		int			io_buffers_len;
+		Buffer		io_buffers[MAX_BUFFERS_PER_TRANSFER];
+		void	   *io_pages[MAX_BUFFERS_PER_TRANSFER];
+		instr_time	io_start;
+		BlockNumber io_first_block;
 
-	return BufferDescriptorGetBuffer(bufHdr);
+		/*
+		 * Skip this block if someone else has already completed it.  If an
+		 * I/O is already in progress in another backend, this will wait for
+		 * the outcome: either done, or something went wrong and we will
+		 * retry.
+		 */
+		if (!WaitReadBuffersCanStartIO(buffers[i], false))
+		{
+			/*
+			 * Report this as a 'hit' for this backend, even though it must
+			 * have started out as a miss in PrepareReadBuffer().
+			 */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + i,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  true);
+			continue;
+		}
+
+		/* We found a buffer that we need to read in. */
+		io_buffers[0] = buffers[i];
+		io_pages[0] = BufferGetBlock(buffers[i]);
+		io_first_block = blocknum + i;
+		io_buffers_len = 1;
+
+		/*
+		 * How many neighboring-on-disk blocks can we scatter-read into
+		 * other buffers at the same time?  In this case we don't wait if we
+		 * see an I/O already in progress.  We already hold BM_IO_IN_PROGRESS
+		 * for the head block, so we should get on with that I/O as soon as
+		 * possible.  We'll come back to this block again, above.
+		 */
+		while ((i + 1) < nblocks &&
+			   WaitReadBuffersCanStartIO(buffers[i + 1], true))
+		{
+			/* Must be consecutive block numbers. */
+			Assert(BufferGetBlockNumber(buffers[i + 1]) ==
+				   BufferGetBlockNumber(buffers[i]) + 1);
+
+			io_buffers[io_buffers_len] = buffers[++i];
+			io_pages[io_buffers_len++] = BufferGetBlock(buffers[i]);
+		}
+
+		io_start = pgstat_prepare_io_time(track_io_timing);
+		smgrreadv(bmr.smgr, forknum, io_first_block, io_pages, io_buffers_len);
+		pgstat_count_io_op_time(io_object, io_context, IOOP_READ, io_start,
+								io_buffers_len);
+
+		/* Verify each block we read, and terminate the I/O. */
+		for (int j = 0; j < io_buffers_len; ++j)
+		{
+			BufferDesc *bufHdr;
+			Block		bufBlock;
+
+			if (isLocalBuf)
+			{
+				bufHdr = GetLocalBufferDescriptor(-io_buffers[j] - 1);
+				bufBlock = LocalBufHdrGetBlock(bufHdr);
+			}
+			else
+			{
+				bufHdr = GetBufferDescriptor(io_buffers[j] - 1);
+				bufBlock = BufHdrGetBlock(bufHdr);
+			}
+
+			/* check for garbage data */
+			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
+										PIV_LOG_WARNING | PIV_REPORT_STAT))
+			{
+				if ((operation->flags & READ_BUFFERS_ZERO_ON_ERROR) || zero_damaged_pages)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s; zeroing out page",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+					memset(bufBlock, 0, BLCKSZ);
+				}
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+			}
+
+			/* Terminate I/O and set BM_VALID. */
+			if (isLocalBuf)
+			{
+				uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
+
+				buf_state |= BM_VALID;
+				pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+			}
+			else
+			{
+				/* Set BM_VALID, terminate IO, and wake up any waiters */
+				TerminateBufferIO(bufHdr, false, BM_VALID, true);
+			}
+
+			/* Report I/Os as completing individually. */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, io_first_block + j,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  false);
+		}
+
+		VacuumPageMiss += io_buffers_len;
+		if (VacuumCostActive)
+			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
+	}
 }
 
 /*
- * BufferAlloc -- subroutine for ReadBuffer.  Handles lookup of a shared
- *		buffer.  If no buffer exists already, selects a replacement
- *		victim and evicts the old page, but does NOT read in new page.
+ * BufferAlloc -- subroutine for StartReadBuffers.  Handles lookup of a shared
+ *		buffer.  If no buffer exists already, selects a replacement victim and
+ *		evicts the old page, but does NOT read in new page.
  *
  * "strategy" can be a buffer replacement strategy object, or NULL for
  * the default strategy.  The selected buffer's usage_count is advanced when
@@ -1223,11 +1475,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
  *
  * The returned buffer is pinned and is already marked as holding the
  * desired page.  If it already did have the desired page, *foundPtr is
- * set true.  Otherwise, *foundPtr is set false and the buffer is marked
- * as IO_IN_PROGRESS; ReadBuffer will now need to do I/O to fill it.
- *
- * *foundPtr is actually redundant with the buffer's BM_VALID flag, but
- * we keep it for simplicity in ReadBuffer.
+ * set true.  Otherwise, *foundPtr is set false.
  *
  * io_context is passed as an output parameter to avoid calling
  * IOContextForStrategy() when there is a shared buffers hit and no IO
@@ -1286,19 +1534,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(buf, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return buf;
@@ -1363,19 +1602,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(existing_buf_hdr, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return existing_buf_hdr;
@@ -1407,15 +1637,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	LWLockRelease(newPartitionLock);
 
 	/*
-	 * Buffer contents are currently invalid.  Try to obtain the right to
-	 * start I/O.  If StartBufferIO returns false, then someone else managed
-	 * to read it before we did, so there's nothing left for BufferAlloc() to
-	 * do.
+	 * Buffer contents are currently invalid.
 	 */
-	if (StartBufferIO(victim_buf_hdr, true))
-		*foundPtr = false;
-	else
-		*foundPtr = true;
+	*foundPtr = false;
 
 	return victim_buf_hdr;
 }
@@ -1769,7 +1993,7 @@ again:
  * pessimistic, but outside of toy-sized shared_buffers it should allow
  * sufficient pins.
  */
-static void
+void
 LimitAdditionalPins(uint32 *additional_pins)
 {
 	uint32		max_backends;
@@ -2034,7 +2258,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 
 				buf_state &= ~BM_VALID;
 				UnlockBufHdr(existing_hdr, buf_state);
-			} while (!StartBufferIO(existing_hdr, true));
+			} while (!StartBufferIO(existing_hdr, true, false));
 		}
 		else
 		{
@@ -2057,7 +2281,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 			LWLockRelease(partition_lock);
 
 			/* XXX: could combine the locked operations in it with the above */
-			StartBufferIO(victim_buf_hdr, true);
+			StartBufferIO(victim_buf_hdr, true, false);
 		}
 	}
 
@@ -2372,7 +2596,12 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 	else
 	{
 		/*
-		 * If we previously pinned the buffer, it must surely be valid.
+		 * If we previously pinned the buffer, it is likely to be valid, but
+		 * it may not be if StartReadBuffers() was called and
+		 * WaitReadBuffers() hasn't been called yet.  We'll check by loading
+		 * the flags without locking.  This is racy, but it's OK to return
+		 * false spuriously: when WaitReadBuffers() calls StartBufferIO(),
+		 * it'll see that it's now valid.
 		 *
 		 * Note: We deliberately avoid a Valgrind client request here.
 		 * Individual access methods can optionally superimpose buffer page
@@ -2381,7 +2610,7 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 		 * that the buffer page is legitimately non-accessible here.  We
 		 * cannot meddle with that.
 		 */
-		result = true;
+		result = (pg_atomic_read_u32(&buf->state) & BM_VALID) != 0;
 	}
 
 	ref->refcount++;
@@ -3449,7 +3678,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOObject io_object,
 	 * someone else flushed the buffer before we could, so we need not do
 	 * anything.
 	 */
-	if (!StartBufferIO(buf, false))
+	if (!StartBufferIO(buf, false, false))
 		return;
 
 	/* Setup error traceback support for ereport() */
@@ -5184,9 +5413,15 @@ WaitIO(BufferDesc *buf)
  *
  * Returns true if we successfully marked the buffer as I/O busy,
  * false if someone else already did the work.
+ *
+ * If nowait is true, then we don't wait for an I/O to be finished by another
+ * backend.  In that case, false indicates either that the I/O was already
+ * finished, or is still in progress.  This is useful for callers that want to
+ * find out if they can perform the I/O as part of a larger operation, without
+ * waiting for the answer or distinguishing the reasons why not.
  */
 static bool
-StartBufferIO(BufferDesc *buf, bool forInput)
+StartBufferIO(BufferDesc *buf, bool forInput, bool nowait)
 {
 	uint32		buf_state;
 
@@ -5199,6 +5434,8 @@ StartBufferIO(BufferDesc *buf, bool forInput)
 		if (!(buf_state & BM_IO_IN_PROGRESS))
 			break;
 		UnlockBufHdr(buf, buf_state);
+		if (nowait)
+			return false;
 		WaitIO(buf);
 	}
 
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index fcfac335a57..985a2c7049c 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -108,10 +108,9 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
  * LocalBufferAlloc -
  *	  Find or create a local buffer for the given page of the given relation.
  *
- * API is similar to bufmgr.c's BufferAlloc, except that we do not need
- * to do any locking since this is all local.   Also, IO_IN_PROGRESS
- * does not get set.  Lastly, we support only default access strategy
- * (hence, usage_count is always advanced).
+ * API is similar to bufmgr.c's BufferAlloc, except that we do not need to do
+ * any locking since this is all local.  We support only default access
+ * strategy (hence, usage_count is always advanced).
  */
 BufferDesc *
 LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
@@ -287,7 +286,7 @@ GetLocalVictimBuffer(void)
 }
 
 /* see LimitAdditionalPins() */
-static void
+void
 LimitAdditionalLocalPins(uint32 *additional_pins)
 {
 	uint32		max_pins;
@@ -297,9 +296,10 @@ LimitAdditionalLocalPins(uint32 *additional_pins)
 
 	/*
 	 * In contrast to LimitAdditionalPins() other backends don't play a role
-	 * here. We can allow up to NLocBuffer pins in total.
+	 * here.  We can allow up to NLocBuffer pins in total, but NLocBuffer
+	 * might not be initialized yet, so read num_temp_buffers instead.
 	 */
-	max_pins = (NLocBuffer - NLocalPinnedBuffers);
+	max_pins = (num_temp_buffers - NLocalPinnedBuffers);
 
 	if (*additional_pins >= max_pins)
 		*additional_pins = max_pins;
diff --git a/src/backend/storage/meson.build b/src/backend/storage/meson.build
index 40345bdca27..739d13293fb 100644
--- a/src/backend/storage/meson.build
+++ b/src/backend/storage/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
+subdir('aio')
 subdir('buffer')
 subdir('file')
 subdir('freespace')
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d3353..b57f71f97e3 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,6 +14,7 @@
 #ifndef BUFMGR_H
 #define BUFMGR_H
 
+#include "port/pg_iovec.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/bufpage.h"
@@ -158,6 +159,11 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 #define BUFFER_LOCK_SHARE		1
 #define BUFFER_LOCK_EXCLUSIVE	2
 
+/*
+ * Maximum number of buffers for multi-buffer I/O functions.  This is set to
+ * allow 128kB transfers, unless BLCKSZ and IOV_MAX imply a smaller maximum.
+ */
+#define MAX_BUFFERS_PER_TRANSFER Min(PG_IOV_MAX, (128 * 1024) / BLCKSZ)
 
 /*
  * prototypes for functions in bufmgr.c
@@ -177,6 +183,42 @@ extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
 										ForkNumber forkNum, BlockNumber blockNum,
 										ReadBufferMode mode, BufferAccessStrategy strategy,
 										bool permanent);
+
+#define READ_BUFFERS_ZERO_ON_ERROR 0x01
+#define READ_BUFFERS_ISSUE_ADVICE 0x02
+
+/*
+ * Private state used by StartReadBuffers() and WaitReadBuffers().  Declared
+ * in public header only to allow inclusion in other structs, but contents
+ * should not be accessed.
+ */
+struct ReadBuffersOperation
+{
+	/* Parameters passed in to StartReadBuffers(). */
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	ForkNumber	forknum;
+	BlockNumber blocknum;
+	int			nblocks;
+	BufferAccessStrategy strategy;
+	int			flags;
+
+	/* Range of buffers, if we need to perform a read. */
+	int			io_buffers_len;
+};
+
+typedef struct ReadBuffersOperation ReadBuffersOperation;
+
+extern bool StartReadBuffers(BufferManagerRelation bmr,
+							 Buffer *buffers,
+							 ForkNumber forknum,
+							 BlockNumber blocknum,
+							 int *nblocks,
+							 BufferAccessStrategy strategy,
+							 int flags,
+							 ReadBuffersOperation *operation);
+extern void WaitReadBuffers(ReadBuffersOperation *operation);
+
 extern void ReleaseBuffer(Buffer buffer);
 extern void UnlockReleaseBuffer(Buffer buffer);
 extern bool BufferIsExclusiveLocked(Buffer buffer);
@@ -250,6 +292,9 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
 
 extern bool BgBufferSync(struct WritebackContext *wb_context);
 
+extern void LimitAdditionalPins(uint32 *additional_pins);
+extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+
 /* in buf_init.c */
 extern void InitBufferPool(void);
 extern Size BufferShmemSize(void);
diff --git a/src/include/storage/streaming_read.h b/src/include/storage/streaming_read.h
new file mode 100644
index 00000000000..c4d3892bb26
--- /dev/null
+++ b/src/include/storage/streaming_read.h
@@ -0,0 +1,52 @@
+#ifndef STREAMING_READ_H
+#define STREAMING_READ_H
+
+#include "storage/bufmgr.h"
+#include "storage/fd.h"
+#include "storage/smgr.h"
+
+/* Default tuning, reasonable for many users. */
+#define PGSR_FLAG_DEFAULT 0x00
+
+/*
+ * I/O streams that are performing maintenance work on behalf of potentially
+ * many users.
+ */
+#define PGSR_FLAG_MAINTENANCE 0x01
+
+/*
+ * We usually avoid issuing prefetch advice automatically when sequential
+ * access is detected, but this flag explicitly disables it, for cases that
+ * might not be correctly detected.  Explicit advice is known to perform worse
+ * than letting the kernel (at least Linux) detect sequential access.
+ */
+#define PGSR_FLAG_SEQUENTIAL 0x02
+
+/*
+ * We usually ramp up from smaller reads to larger ones, to support users who
+ * don't know if it's worth reading lots of buffers yet.  This flag disables
+ * that, declaring ahead of time that we'll be reading all available buffers.
+ */
+#define PGSR_FLAG_FULL 0x04
+
+struct PgStreamingRead;
+typedef struct PgStreamingRead PgStreamingRead;
+
+/* Callback that returns the next block number to read. */
+typedef BlockNumber (*PgStreamingReadBufferCB) (PgStreamingRead *pgsr,
+												void *pgsr_private,
+												void *per_buffer_private);
+
+extern PgStreamingRead *pg_streaming_read_buffer_alloc(int flags,
+													   void *pgsr_private,
+													   size_t per_buffer_private_size,
+													   BufferAccessStrategy strategy,
+													   BufferManagerRelation bmr,
+													   ForkNumber forknum,
+													   PgStreamingReadBufferCB next_block_cb);
+
+extern void pg_streaming_read_prefetch(PgStreamingRead *pgsr);
+extern Buffer pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_private);
+extern void pg_streaming_read_free(PgStreamingRead *pgsr);
+
+#endif
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cc3611e6068..5f637f07eeb 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,8 @@ PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
 PgStat_WalStats
+PgStreamingRead
+PgStreamingReadRange
 PgXmlErrorContext
 PgXmlStrictness
 Pg_finfo_record
@@ -2267,6 +2269,7 @@ ReInitializeDSMForeignScan_function
 ReScanForeignScan_function
 ReadBufPtrType
 ReadBufferMode
+ReadBuffersOperation
 ReadBytePtrType
 ReadExtraTocPtrType
 ReadFunc
-- 
2.40.1

v7-0006-Vacuum-first-pass-uses-Streaming-Read-interface.patch
From bc9d97de3729e65752ef6a6e9cbfc0808c4725ac Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 11:29:02 -0500
Subject: [PATCH v7 6/7] Vacuum first pass uses Streaming Read interface

Now vacuum's first pass, which HOT prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by implementing a
streaming read callback which invokes heap_vac_scan_next_block().
---
 src/backend/access/heap/vacuumlazy.c | 131 +++++++++++++++++++--------
 1 file changed, 92 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d2c8f27fc57..d07a2a58b15 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -54,6 +54,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/streaming_read.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -168,7 +169,12 @@ typedef struct LVRelState
 	char	   *relnamespace;
 	char	   *relname;
 	char	   *indname;		/* Current index name */
-	BlockNumber blkno;			/* used only for heap operations */
+
+	/*
+	 * The current block being processed by vacuum. Used only for heap
+	 * operations. Primarily for error reporting and logging.
+	 */
+	BlockNumber blkno;
 	OffsetNumber offnum;		/* used only for heap operations */
 	VacErrPhase phase;
 	bool		verbose;		/* VACUUM VERBOSE? */
@@ -220,6 +226,12 @@ typedef struct LVRelState
 		BlockNumber next_unskippable_block;
 		/* Next unskippable block's visibility status */
 		bool		next_unskippable_allvis;
+
+		/*
+		 * Buffer containing block of VM with visibility information for
+		 * next_unskippable_block.
+		 */
+		Buffer		next_unskippable_vmbuffer;
 	}			next_block_state;
 } LVRelState;
 
@@ -233,8 +245,7 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
-									 BlockNumber *blkno,
+static void heap_vac_scan_next_block(LVRelState *vacrel,
 									 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -777,6 +788,47 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	}
 }
 
+static BlockNumber
+vacuum_scan_pgsr_next(PgStreamingRead *pgsr,
+					  void *pgsr_private, void *per_buffer_data)
+{
+	LVRelState *vacrel = pgsr_private;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
+
+	heap_vac_scan_next_block(vacrel,
+							 all_visible_according_to_vm);
+
+	/*
+	 * If there are no further blocks to vacuum in the relation, release the
+	 * vmbuffer.
+	 */
+	if (!BlockNumberIsValid(vacrel->next_block_state.current_block) &&
+		BufferIsValid(vacrel->next_block_state.next_unskippable_vmbuffer))
+	{
+		ReleaseBuffer(vacrel->next_block_state.next_unskippable_vmbuffer);
+		vacrel->next_block_state.next_unskippable_vmbuffer = InvalidBuffer;
+	}
+
+	return vacrel->next_block_state.current_block;
+}
+
+static inline PgStreamingRead *
+vac_scan_pgsr_alloc(LVRelState *vacrel, PgStreamingReadBufferCB next_block_cb)
+{
+	PgStreamingRead *result = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+															 sizeof(bool), vacrel->bstrategy, BMR_REL(vacrel->rel),
+															 MAIN_FORKNUM, next_block_cb);
+
+	/*
+	 * Initialize for the first heap_vac_scan_next_block() call. These rely
+	 * on InvalidBlockNumber + 1 == 0.
+	 */
+	vacrel->next_block_state.current_block = InvalidBlockNumber;
+	vacrel->next_block_state.next_unskippable_block = InvalidBlockNumber;
+
+	return result;
+}
+
 /*
  *	lazy_scan_heap() -- workhorse function for VACUUM
  *
@@ -816,10 +868,10 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -830,23 +882,27 @@ lazy_scan_heap(LVRelState *vacrel)
 	};
 	int64		initprog_val[3];
 
+	PgStreamingRead *pgsr = vac_scan_pgsr_alloc(vacrel, vacuum_scan_pgsr_next);
+
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
 	initprog_val[1] = rel_pages;
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Initialize for first heap_vac_scan_next_block() call */
-	vacrel->next_block_state.current_block = InvalidBlockNumber;
-	vacrel->next_block_state.next_unskippable_block = InvalidBlockNumber;
-
-	while (heap_vac_scan_next_block(vacrel, &vmbuffer,
-									&blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf = pg_streaming_read_buffer_get_next(pgsr,
+																 (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
+		BlockNumber blkno;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+
+		page = BufferGetPage(buf);
 
 		vacrel->scanned_pages++;
 
@@ -914,9 +970,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
 
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
@@ -973,7 +1026,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1030,7 +1083,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	}
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, vacrel->rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1045,6 +1098,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	pg_streaming_read_free(pgsr);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1056,11 +1111,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (vacrel->rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, vacrel->rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, vacrel->rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1072,12 +1127,13 @@ lazy_scan_heap(LVRelState *vacrel)
  *
  * lazy_scan_heap() calls here every time it needs to get the next block to
  * prune and vacuum, using the visibility map, vacuum options, and various
- * thresholds to skip blocks which do not need to be processed and set blkno to
- * the next block that actually needs to be processed.
+ * thresholds to skip blocks which do not need to be processed and set
+ * current_block to the next block that actually needs to be processed.
  *
- * The block number and visibility status of the next block to process are set
- * in blkno and all_visible_according_to_vm. heap_vac_scan_next_block()
- * returns false if there are no further blocks to process.
+ * The block number and visibility status of the next block to process are
+ * set in vacrel->next_block_state.current_block and
+ * all_visible_according_to_vm.  vacrel->next_block_state.current_block is
+ * set to InvalidBlockNumber if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here; vacuum options and information about the
  * relation are read, members of vacrel->next_block_state are read and set as
@@ -1085,12 +1141,10 @@ lazy_scan_heap(LVRelState *vacrel)
  * don't advance relfrozenxid when we have skipped vacuuming all-visible
  * blocks.
  *
- * vmbuffer is an output parameter which, upon return, will contain the block
- * from the VM containing visibility information for the next unskippable heap
- * block. If we decide not to skip this heap block, the caller is responsible
- * for fetching the correct VM block into vmbuffer before using it. This is
- * okay as providing it as an output parameter is an optimization, not a
- * requirement.
+ * vacrel->next_block_state.next_unskippable_vmbuffer will contain visibility
+ * information for the next unskippable heap block. If we decide not to skip
+ * this heap block, the caller is responsible for fetching the correct VM
+ * block into that buffer before using it.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1100,9 +1154,9 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
-						 BlockNumber *blkno, bool *all_visible_according_to_vm)
+static void
+heap_vac_scan_next_block(LVRelState *vacrel,
+						 bool *all_visible_according_to_vm)
 {
 	/* Relies on InvalidBlockNumber + 1 == 0 */
 	BlockNumber next_block = vacrel->next_block_state.current_block + 1;
@@ -1129,8 +1183,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
 	 */
 	if (next_block >= vacrel->rel_pages)
 	{
-		vacrel->next_block_state.current_block = *blkno = InvalidBlockNumber;
-		return false;
+		vacrel->next_block_state.current_block = InvalidBlockNumber;
+		return;
 	}
 
 	if (vacrel->next_block_state.next_unskippable_block == InvalidBlockNumber ||
@@ -1144,7 +1198,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
 		{
 			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 														   next_unskippable_block,
-														   vmbuffer);
+														   &vacrel->next_block_state.next_unskippable_vmbuffer);
 
 			vacrel->next_block_state.next_unskippable_allvis = mapbits & VISIBILITYMAP_ALL_VISIBLE;
 
@@ -1224,8 +1278,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, Buffer *vmbuffer,
 	else
 		*all_visible_according_to_vm = true;
 
-	vacrel->next_block_state.current_block = *blkno = next_block;
-	return true;
+	vacrel->next_block_state.current_block = next_block;
 }
 
 /*
-- 
2.40.1

v7-0007-Vacuum-second-pass-uses-Streaming-Read-interface.patch
From c3cf35fcb3110da791e9edc1b3325dc8d0080068 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v7 7/7] Vacuum second pass uses Streaming Read interface

Now vacuum's second pass, which removes dead items referring to dead
tuples catalogued in the first pass, uses the streaming read API by
implementing a streaming read callback which returns the next block
containing previously catalogued dead items. A new struct,
VacReapBlkState, is introduced to provide the caller with the starting
and ending indexes of dead items to vacuum.
---
 src/backend/access/heap/vacuumlazy.c | 110 ++++++++++++++++++++-------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 85 insertions(+), 26 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d07a2a58b15..375b66a62c4 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -195,6 +195,12 @@ typedef struct LVRelState
 	BlockNumber missed_dead_pages;	/* # pages with missed dead tuples */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 
+	/*
+	 * The index of the next TID in dead_items to reap during the second
+	 * vacuum pass.
+	 */
+	int			idx_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -243,6 +249,21 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * State set up in streaming read callback during vacuum's second pass which
+ * removes dead items referring to dead tuples catalogued in the first pass
+ */
+typedef struct VacReapBlkState
+{
+	/*
+	 * The indexes of the TIDs of the first and last dead tuples in a single
+	 * block in the currently vacuumed relation. The callback will set these
+	 * up prior to adding this block to the stream.
+	 */
+	int			start_idx;
+	int			end_idx;
+} VacReapBlkState;
+
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vac_scan_next_block(LVRelState *vacrel,
@@ -260,8 +281,9 @@ static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 static void lazy_vacuum(LVRelState *vacrel);
 static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void lazy_vacuum_heap_rel(LVRelState *vacrel);
-static int	lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
-								  Buffer buffer, int index, Buffer vmbuffer);
+static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+								  Buffer buffer, Buffer vmbuffer,
+								  VacReapBlkState *rbstate);
 static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
 static void lazy_cleanup_all_indexes(LVRelState *vacrel);
 static IndexBulkDeleteResult *lazy_vacuum_one_index(Relation indrel,
@@ -2426,6 +2448,37 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_pgsr_next(PgStreamingRead *pgsr,
+						 void *pgsr_private,
+						 void *per_buffer_data)
+{
+	BlockNumber blkno;
+	LVRelState *vacrel = pgsr_private;
+	VacReapBlkState *rbstate = per_buffer_data;
+
+	VacDeadItems *dead_items = vacrel->dead_items;
+
+	if (vacrel->idx_prefetch == dead_items->num_items)
+		return InvalidBlockNumber;
+
+	blkno = ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+	rbstate->start_idx = vacrel->idx_prefetch;
+
+	for (; vacrel->idx_prefetch < dead_items->num_items; vacrel->idx_prefetch++)
+	{
+		BlockNumber curblkno =
+			ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+
+		if (blkno != curblkno)
+			break;				/* past end of tuples for this block */
+	}
+
+	rbstate->end_idx = vacrel->idx_prefetch;
+
+	return blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2447,7 +2500,9 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
-	int			index = 0;
+	Buffer		buf;
+	PgStreamingRead *pgsr;
+	VacReapBlkState *rbstate;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2465,17 +2520,21 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 VACUUM_ERRCB_PHASE_VACUUM_HEAP,
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
-	while (index < vacrel->dead_items->num_items)
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(VacReapBlkState), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_reap_lp_pgsr_next);
+
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr,
+														   (void **) &rbstate)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		blkno = ItemPointerGetBlockNumber(&vacrel->dead_items->items[index]);
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		/*
 		 * Pin the visibility map page in case we need to mark the page
@@ -2485,10 +2544,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
-		index = lazy_vacuum_heap_page(vacrel, blkno, buf, index, vmbuffer);
+		lazy_vacuum_heap_page(vacrel, blkno, buf, vmbuffer, rbstate);
 
 		/* Now that we've vacuumed the page, record its available space */
 		page = BufferGetPage(buf);
@@ -2507,14 +2564,16 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 	 * We set all LP_DEAD items from the first heap pass to LP_UNUSED during
 	 * the second heap pass.  No more, no less.
 	 */
-	Assert(index > 0);
+	Assert(rbstate->end_idx > 0);
 	Assert(vacrel->num_index_scans > 1 ||
-		   (index == vacrel->lpdead_items &&
+		   (rbstate->end_idx == vacrel->lpdead_items &&
 			vacuumed_pages == vacrel->lpdead_item_pages));
 
+	pg_streaming_read_free(pgsr);
+
 	ereport(DEBUG2,
 			(errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
-					vacrel->relname, (long long) index, vacuumed_pages)));
+					vacrel->relname, (long long) rbstate->end_idx, vacuumed_pages)));
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
@@ -2528,13 +2587,12 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
  * cleanup lock is also acceptable).  vmbuffer must be valid and already have
  * a pin on blkno's visibility map page.
  *
- * index is an offset into the vacrel->dead_items array for the first listed
- * LP_DEAD item on the page.  The return value is the first index immediately
- * after all LP_DEAD items for the same page in the array.
+ * Given a block and dead items recorded during the first pass, set those items
+ * dead and truncate the line pointer array. Update the VM as appropriate.
  */
-static int
-lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
-					  int index, Buffer vmbuffer)
+static void
+lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+					  Buffer buffer, Buffer vmbuffer, VacReapBlkState *rbstate)
 {
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Page		page = BufferGetPage(buffer);
@@ -2555,16 +2613,17 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; index < dead_items->num_items; index++)
+	for (int i = rbstate->start_idx; i < rbstate->end_idx; i++)
 	{
-		BlockNumber tblk;
 		OffsetNumber toff;
+		ItemPointer dead_item;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&dead_items->items[index]);
-		if (tblk != blkno)
-			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&dead_items->items[index]);
+		dead_item = &dead_items->items[i];
+
+		Assert(ItemPointerGetBlockNumber(dead_item) == blkno);
+
+		toff = ItemPointerGetOffsetNumber(dead_item);
 		itemid = PageGetItemId(page, toff);
 
 		Assert(ItemIdIsDead(itemid) && !ItemIdHasStorage(itemid));
@@ -2634,7 +2693,6 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
-	return index;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 5f637f07eeb..20b85a69f9d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2972,6 +2972,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacReapBlkState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.40.1

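For readers following along, the per-block grouping that vacuum_reap_lp_pgsr_next() performs over the sorted dead_items array can be sketched in isolation roughly like this. The types and names below are simplified stand-ins for illustration, not PostgreSQL's actual structures:

```c
#include <assert.h>

typedef unsigned int BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Stand-in for an ItemPointer: only the block number matters here */
typedef struct
{
	BlockNumber block;
} DeadItem;

/*
 * Starting at *idx, return the block number of the next run of dead items
 * and set *start_idx / *end_idx to the half-open range [start, end) of
 * items on that block, advancing *idx past the run.  Returns
 * InvalidBlockNumber once all items are consumed -- mirroring how the
 * streaming read callback signals end-of-stream.  Assumes items are
 * sorted by block number, as dead_items is.
 */
static BlockNumber
next_reap_block(const DeadItem *items, int nitems, int *idx,
				int *start_idx, int *end_idx)
{
	BlockNumber blkno;

	if (*idx == nitems)
		return InvalidBlockNumber;

	blkno = items[*idx].block;
	*start_idx = *idx;

	while (*idx < nitems && items[*idx].block == blkno)
		(*idx)++;

	*end_idx = *idx;
	return blkno;
}
```

The [start_idx, end_idx) pair corresponds to what the patch stores in VacReapBlkState, so lazy_vacuum_heap_page() no longer needs to re-derive the block boundary itself.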
#17Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#16)
1 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

I feel heap_vac_scan_get_next_block() function could use some love. Maybe
just some rewording of the comments, or maybe some other refactoring; not
sure. But I'm pretty happy with the function signature and how it's called.

I've cleaned up the comments on heap_vac_scan_next_block() in the first
couple patches (not so much in the streaming read user). Let me know if
it addresses your feelings or if I should look for other things I could
change.

Thanks, that is better. I think I now finally understand how the
function works, and now I can see some more issues and refactoring
opportunities :-).

Looking at current lazy_scan_skip() code in 'master', one thing now
caught my eye (and it's the same with your patches):

*next_unskippable_allvis = true;
while (next_unskippable_block < rel_pages)
{
uint8 mapbits = visibilitymap_get_status(vacrel->rel,
next_unskippable_block,
vmbuffer);

if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
{
Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
*next_unskippable_allvis = false;
break;
}

/*
* Caller must scan the last page to determine whether it has tuples
* (caller must have the opportunity to set vacrel->nonempty_pages).
* This rule avoids having lazy_truncate_heap() take access-exclusive
* lock on rel to attempt a truncation that fails anyway, just because
* there are tuples on the last page (it is likely that there will be
* tuples on other nearby pages as well, but those can be skipped).
*
* Implement this by always treating the last block as unsafe to skip.
*/
if (next_unskippable_block == rel_pages - 1)
break;

/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
if (!vacrel->skipwithvm)
{
/* Caller shouldn't rely on all_visible_according_to_vm */
*next_unskippable_allvis = false;
break;
}

/*
* Aggressive VACUUM caller can't skip pages just because they are
* all-visible. They may still skip all-frozen pages, which can't
* contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
*/
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
{
if (vacrel->aggressive)
break;

/*
* All-visible block is safe to skip in non-aggressive case. But
* remember that the final range contains such a block for later.
*/
skipsallvis = true;
}

/* XXX: is it OK to remove this? */
vacuum_delay_point();
next_unskippable_block++;
nskippable_blocks++;
}

Firstly, it seems silly to check DISABLE_PAGE_SKIPPING within the loop.
When DISABLE_PAGE_SKIPPING is set, we always return the next block and
set *next_unskippable_allvis = false regardless of the visibility map,
so why bother checking the visibility map at all?

Except at the very last block of the relation! If you look carefully,
at the last block we do return *next_unskippable_allvis = true, if the
VM says so, even if DISABLE_PAGE_SKIPPING is set. I think that's wrong.
Surely the intention was to pretend that none of the VM bits were set if
DISABLE_PAGE_SKIPPING is used, also for the last block.

This was changed in commit 980ae17310:

@@ -1311,7 +1327,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,

/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
if (!vacrel->skipwithvm)
+               {
+                       /* Caller shouldn't rely on all_visible_according_to_vm */
+                       *next_unskippable_allvis = false;
break;
+               }

Before that, *next_unskippable_allvis was set correctly according to the
VM, even when DISABLE_PAGE_SKIPPING was used. It's not clear to me why
that was changed. And I think setting it to 'true' would be a more
failsafe value than 'false'. When *next_unskippable_allvis is set to
true, the caller cannot rely on it because a concurrent modification
could immediately clear the VM bit. But because VACUUM is the only
process that sets VM bits, if it's set to false, the caller can assume
that it's still not set later on.

One consequence of that is that with DISABLE_PAGE_SKIPPING,
lazy_scan_heap() dirties all pages, even if there are no changes. The
attached test script demonstrates that.

ISTM we should revert the above hunk, and backpatch it to v16. I'm a
little wary because I don't understand why that change was made in the
first place, though. I think it was just an ill-advised attempt at
tidying up the code as part of the larger commit, but I'm not sure.
Peter, do you remember?

I wonder if we should give up trying to set all_visible_according_to_vm
correctly when we decide what to skip, and always do
"all_visible_according_to_vm = visibilitymap_get_status(...)" in
lazy_scan_prune(). It would be more expensive, but maybe it doesn't
matter in practice. It would get rid of this tricky bookkeeping in
heap_vac_scan_next_block().

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

vactest.sql (application/sql)
#18Peter Geoghegan
pg@bowt.ie
In reply to: Heikki Linnakangas (#17)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 8:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

ISTM we should revert the above hunk, and backpatch it to v16. I'm a
little wary because I don't understand why that change was made in the
first place, though. I think it was just an ill-advised attempt at
tidying up the code as part of the larger commit, but I'm not sure.
Peter, do you remember?

I think that it makes sense to set the VM when indicated by
lazy_scan_prune, independent of what either the visibility map or the
page's PD_ALL_VISIBLE marking say. The whole point of
DISABLE_PAGE_SKIPPING is to deal with VM corruption, after all.

In retrospect I didn't handle this particular aspect very well in
commit 980ae17310. The approach I took is a bit crude (and in any case
slightly wrong in that it is inconsistent in how it handles the last
page). But it has the merit of fixing the case where we just have the
VM's all-frozen bit set for a given block (not the all-visible bit
set) -- which is always wrong. There was good reason to be concerned
about that possibility when 980ae17310 went in.

--
Peter Geoghegan

#19Melanie Plageman
melanieplageman@gmail.com
In reply to: Heikki Linnakangas (#17)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 8:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

I feel heap_vac_scan_get_next_block() function could use some love. Maybe
just some rewording of the comments, or maybe some other refactoring; not
sure. But I'm pretty happy with the function signature and how it's called.

I've cleaned up the comments on heap_vac_scan_next_block() in the first
couple patches (not so much in the streaming read user). Let me know if
it addresses your feelings or if I should look for other things I could
change.

Thanks, that is better. I think I now finally understand how the
function works, and now I can see some more issues and refactoring
opportunities :-).

Looking at current lazy_scan_skip() code in 'master', one thing now
caught my eye (and it's the same with your patches):

*next_unskippable_allvis = true;
while (next_unskippable_block < rel_pages)
{
uint8 mapbits = visibilitymap_get_status(vacrel->rel,
next_unskippable_block,
vmbuffer);

if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
{
Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
*next_unskippable_allvis = false;
break;
}

/*
* Caller must scan the last page to determine whether it has tuples
* (caller must have the opportunity to set vacrel->nonempty_pages).
* This rule avoids having lazy_truncate_heap() take access-exclusive
* lock on rel to attempt a truncation that fails anyway, just because
* there are tuples on the last page (it is likely that there will be
* tuples on other nearby pages as well, but those can be skipped).
*
* Implement this by always treating the last block as unsafe to skip.
*/
if (next_unskippable_block == rel_pages - 1)
break;

/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
if (!vacrel->skipwithvm)
{
/* Caller shouldn't rely on all_visible_according_to_vm */
*next_unskippable_allvis = false;
break;
}

/*
* Aggressive VACUUM caller can't skip pages just because they are
* all-visible. They may still skip all-frozen pages, which can't
* contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
*/
if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
{
if (vacrel->aggressive)
break;

/*
* All-visible block is safe to skip in non-aggressive case. But
* remember that the final range contains such a block for later.
*/
skipsallvis = true;
}

/* XXX: is it OK to remove this? */
vacuum_delay_point();
next_unskippable_block++;
nskippable_blocks++;
}

Firstly, it seems silly to check DISABLE_PAGE_SKIPPING within the loop.
When DISABLE_PAGE_SKIPPING is set, we always return the next block and
set *next_unskippable_allvis = false regardless of the visibility map,
so why bother checking the visibility map at all?

Except at the very last block of the relation! If you look carefully,
at the last block we do return *next_unskippable_allvis = true, if the
VM says so, even if DISABLE_PAGE_SKIPPING is set. I think that's wrong.
Surely the intention was to pretend that none of the VM bits were set if
DISABLE_PAGE_SKIPPING is used, also for the last block.

I agree that having next_unskippable_allvis and, as a consequence,
all_visible_according_to_vm set to true for the last block seems
wrong. And it also makes sense, from a loop-efficiency standpoint, to
move the check up to the top. However, making that change would have us end
up dirtying all pages in your example.

This was changed in commit 980ae17310:

@@ -1311,7 +1327,11 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,

/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
if (!vacrel->skipwithvm)
+               {
+                       /* Caller shouldn't rely on all_visible_according_to_vm */
+                       *next_unskippable_allvis = false;
break;
+               }

Before that, *next_unskippable_allvis was set correctly according to the
VM, even when DISABLE_PAGE_SKIPPING was used. It's not clear to me why
that was changed. And I think setting it to 'true' would be a more
failsafe value than 'false'. When *next_unskippable_allvis is set to
true, the caller cannot rely on it because a concurrent modification
could immediately clear the VM bit. But because VACUUM is the only
process that sets VM bits, if it's set to false, the caller can assume
that it's still not set later on.

One consequence of that is that with DISABLE_PAGE_SKIPPING,
lazy_scan_heap() dirties all pages, even if there are no changes. The
attached test script demonstrates that.

This does seem undesirable.

However, if we do as you suggest above and don't check
DISABLE_PAGE_SKIPPING in the loop, and instead return without checking
the VM when DISABLE_PAGE_SKIPPING is passed (setting
next_unskippable_allvis = false), that would fix the last-block issue,
but we would still end up dirtying all pages in your example.

ISTM we should revert the above hunk, and backpatch it to v16. I'm a
little wary because I don't understand why that change was made in the
first place, though. I think it was just an ill-advised attempt at
tidying up the code as part of the larger commit, but I'm not sure.
Peter, do you remember?

If we revert this, then when all_visible_according_to_vm and
all_visible are true in lazy_scan_prune(), the VM will only get
updated when all_frozen is true and the VM doesn't have all frozen set
yet, so maybe that is inconsistent with the goal of
DISABLE_PAGE_SKIPPING to update the VM when its contents are "suspect"
(according to docs).

I wonder if we should give up trying to set all_visible_according_to_vm
correctly when we decide what to skip, and always do
"all_visible_according_to_vm = visibilitymap_get_status(...)" in
lazy_scan_prune(). It would be more expensive, but maybe it doesn't
matter in practice. It would get rid of this tricky bookkeeping in
heap_vac_scan_next_block().

I did some experiments on this in the past and thought that it did
have a perf impact to call visibilitymap_get_status() every time. But
let me try and dig those up. (That doesn't speak to whether or not it
matters in practice.)

- Melanie

#20Melanie Plageman
melanieplageman@gmail.com
In reply to: Peter Geoghegan (#18)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 10:41 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Mar 8, 2024 at 8:49 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

ISTM we should revert the above hunk, and backpatch it to v16. I'm a
little wary because I don't understand why that change was made in the
first place, though. I think it was just an ill-advised attempt at
tidying up the code as part of the larger commit, but I'm not sure.
Peter, do you remember?

I think that it makes sense to set the VM when indicated by
lazy_scan_prune, independent of what either the visibility map or the
page's PD_ALL_VISIBLE marking say. The whole point of
DISABLE_PAGE_SKIPPING is to deal with VM corruption, after all.

Not that it will be fun to maintain another special case in the VM
update code in lazy_scan_prune(), but we could have a special case
that checks if DISABLE_PAGE_SKIPPING was passed to vacuum and if
all_visible_according_to_vm is true and all_visible is true, we update
the VM but don't dirty the page. The docs on DISABLE_PAGE_SKIPPING say
it is meant to deal with VM corruption -- it doesn't say anything
about dealing with incorrectly set PD_ALL_VISIBLE markings.

- Melanie

#21Peter Geoghegan
pg@bowt.ie
In reply to: Melanie Plageman (#20)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 10:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Not that it will be fun to maintain another special case in the VM
update code in lazy_scan_prune(), but we could have a special case
that checks if DISABLE_PAGE_SKIPPING was passed to vacuum and if
all_visible_according_to_vm is true and all_visible is true, we update
the VM but don't dirty the page.

It wouldn't necessarily have to be a special case, I think.

We already conditionally set PD_ALL_VISIBLE/call PageIsAllVisible() in
the block where lazy_scan_prune marks a previously all-visible page
all-frozen -- we don't want to dirty the page unnecessarily there.
Making it conditional is defensive in that particular block (this was
also added by this same commit of mine), and avoids dirtying the page.

Seems like it might be possible to simplify/consolidate the VM-setting
code that's now located at the end of lazy_scan_prune. Perhaps the two
distinct blocks that call visibilitymap_set() could be combined into
one.

--
Peter Geoghegan

#22Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#21)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 11:00 AM Peter Geoghegan <pg@bowt.ie> wrote:

Seems like it might be possible to simplify/consolidate the VM-setting
code that's now located at the end of lazy_scan_prune. Perhaps the two
distinct blocks that call visibilitymap_set() could be combined into
one.

FWIW I think that my error here might have had something to do with
hallucinating that the code already did things that way.

At the time this went in, I was working on a patchset that did things
this way (more or less). It broke the dependency on
all_visible_according_to_vm entirely, which simplified the
set-and-check-VM code that's now at the end of lazy_scan_prune.

Not sure how practical it'd be to do something like that now (not
offhand), but something to consider.

--
Peter Geoghegan

#23Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#16)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

I will say that now all of the variable names are *very* long. I didn't
want to remove the "state" from LVRelState->next_block_state. (In fact, I
kind of miss the "get". But I had to draw the line somewhere.) I think
without "state" in the name, next_block sounds too much like a function.

Any ideas for shortening the names of next_block_state and its members
or are you fine with them?

Hmm, we can remove the inner struct and add the fields directly into
LVRelState. LVRelState already contains many groups of variables, like
"Error reporting state", with no inner structs. I did it that way in the
attached patch. I also used local variables more.

I was wondering if we should remove the "get" and just go with
heap_vac_scan_next_block(). I didn't do that originally because I didn't
want to imply that the next block was literally the sequentially next
block, but I think maybe I was overthinking it.

Another idea is to call it heap_scan_vac_next_block() and then the order
of the words is more like the table AM functions that get the next block
(e.g. heapam_scan_bitmap_next_block()). Though maybe we don't want it to
be too similar to those since this isn't a table AM callback.

I've done a version of this.

+1

However, by adding a vmbuffer to next_block_state, the callback may be
able to avoid extra VM fetches from one invocation to the next.

That's a good idea, holding separate VM buffer pins for the
next-unskippable block and the block we're processing. I adopted that
approach.

My compiler caught one small bug when I was playing with various
refactorings of this: heap_vac_scan_next_block() must set *blkno to
rel_pages, not InvalidBlockNumber, after the last block. The caller uses
the 'blkno' variable also after the loop, and assumes that it's set to
rel_pages.

I'm pretty happy with the attached patches now. The first one fixes the
existing bug I mentioned in the other email (based on the on-going
discussion that might not be how we want to fix it though). Second commit
is a squash of most of the patches. Third patch is the removal of the
delay point, that seems worthwhile to keep separate.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v8-0001-Set-all_visible_according_to_vm-correctly-with-DI.patch (text/x-patch)
From b68cb29c547de3c4acd10f31aad47b453d154666 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 16:00:22 +0200
Subject: [PATCH v8 1/3] Set all_visible_according_to_vm correctly with
 DISABLE_PAGE_SKIPPING

It's important for 'all_visible_according_to_vm' to correctly reflect
whether the VM bit is set or not, even when we are not trusting the VM
to skip pages, because contrary to what the comment said,
lazy_scan_prune() relies on it.

If it's incorrectly set to 'false', when the VM bit is in fact set,
lazy_scan_prune() will try to set the VM bit again and dirty the page
unnecessarily. As a result, if you used DISABLE_PAGE_SKIPPING, all
heap pages were dirtied, even if there were no changes. We would also
fail to clear any VM bits that were set incorrectly.

This was broken in commit 980ae17310, so backpatch to v16.

Backpatch-through: 16
Reviewed-by: Melanie Plageman
Discussion: https://www.postgresql.org/message-id/3df2b582-dc1c-46b6-99b6-38eddd1b2784@iki.fi
---
 src/backend/access/heap/vacuumlazy.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8b320c3f89a..ac55ebd2ae5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1136,11 +1136,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
 		if (!vacrel->skipwithvm)
-		{
-			/* Caller shouldn't rely on all_visible_according_to_vm */
-			*next_unskippable_allvis = false;
 			break;
-		}
 
 		/*
 		 * Aggressive VACUUM caller can't skip pages just because they are
-- 
2.39.2

v8-0002-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch (text/x-patch)
From 47af1ca65cf55ca876869b43bff47f9d43f0750e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 17:32:19 +0200
Subject: [PATCH v8 2/3] Confine vacuum skip logic to lazy_scan_skip()

Rename lazy_scan_skip() to heap_vac_scan_next_block() and move more
code into the function, so that the caller doesn't need to know about
ranges or skipping anymore. heap_vac_scan_next_block() returns the
next block to process, and the logic for determining that block is all
within the function. This makes the skipping logic easier to
understand, as it's all in the same function, and makes the calling
code easier to understand as it's less cluttered. The state variables
needed to manage the skipping logic are moved to LVRelState.

heap_vac_scan_next_block() now manages its own VM buffer separately
from the caller's vmbuffer variable. The caller's vmbuffer holds the
VM page for the current block its processing, while
heap_vac_scan_next_block() keeps a pin on the VM page for the next
unskippable block. Most of the time they are the same, so we hold two
pins on the same buffer, but it's more convenient to manage them
separately.

This refactoring will also help future patches to switch to using a
streaming read interface, and eventually AIO
(https://postgr.es/m/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com)

Author: Melanie Plageman, with some changes by me
Discussion: https://postgr.es/m/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 256 +++++++++++++++------------
 1 file changed, 141 insertions(+), 115 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae5..0aa08762015 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,12 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/* State maintained by heap_vac_scan_next_block() */
+	BlockNumber current_block;	/* last block returned */
+	BlockNumber next_unskippable_block; /* next unskippable block */
+	bool		next_unskippable_allvis;	/* its visibility status */
+	Buffer		next_unskippable_vmbuffer;	/* buffer containing its VM bit */
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -217,10 +223,8 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+									 bool *all_visible_according_to_vm);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -803,12 +807,11 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -822,44 +825,19 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+	/* Initialize for the first heap_vac_scan_next_block() call */
+	vacrel->current_block = InvalidBlockNumber;
+	vacrel->next_unskippable_block = InvalidBlockNumber;
+	vacrel->next_unskippable_allvis = false;
+	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
+
+	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
-
-			Assert(next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1077,18 +1055,22 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum.  The function uses the visibility map, vacuum options,
+ * and various thresholds to skip blocks which do not need to be processed and
+ * sets blkno to the next block that actually needs to be processed.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * The block number and visibility status of the next block to process are set
+ * in *blkno and *all_visible_according_to_vm.  The return value is false if
+ * there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about
+ * the relation are read, and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all-visible blocks.  It
+ * also holds information about the next unskippable block, as bookkeeping for
+ * this function.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1098,88 +1080,132 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+						 bool *all_visible_according_to_vm)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+	BlockNumber next_block;
 	bool		skipsallvis = false;
+	BlockNumber rel_pages = vacrel->rel_pages;
+	BlockNumber next_unskippable_block;
+	bool		next_unskippable_allvis;
+	Buffer		next_unskippable_vmbuffer;
 
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
-	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   vmbuffer);
+	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
+	next_block = vacrel->current_block + 1;
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	/* Have we reached the end of the relation? */
+	if (next_block >= rel_pages)
+	{
+		if (BufferIsValid(vacrel->next_unskippable_vmbuffer))
 		{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
-			break;
+			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
+			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
+		*blkno = rel_pages;
+		return false;
+	}
 
+	next_unskippable_block = vacrel->next_unskippable_block;
+	next_unskippable_allvis = vacrel->next_unskippable_allvis;
+	if (next_unskippable_block == InvalidBlockNumber ||
+		next_block > next_unskippable_block)
+	{
 		/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
+		 * Find the next unskippable block using the visibility map.
 		 */
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+		next_unskippable_block = next_block;
+		next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer;
+		for (;;)
+		{
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &next_unskippable_vmbuffer);
 
-		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-		if (!vacrel->skipwithvm)
-			break;
+			next_unskippable_allvis = (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0;
 
-		/*
-		 * Aggressive VACUUM caller can't skip pages just because they are
-		 * all-visible.  They may still skip all-frozen pages, which can't
-		 * contain XIDs < OldestXmin (XIDs that aren't already frozen by now).
-		 */
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
-		{
-			if (vacrel->aggressive)
+			/*
+			 * A block is unskippable if it is not all visible according to
+			 * the visibility map.
+			 */
+			if (!next_unskippable_allvis)
+			{
+				Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
+				break;
+			}
+
+			/*
+			 * Caller must scan the last page to determine whether it has
+			 * tuples (caller must have the opportunity to set
+			 * vacrel->nonempty_pages).  This rule avoids having
+			 * lazy_truncate_heap() take access-exclusive lock on rel to
+			 * attempt a truncation that fails anyway, just because there are
+			 * tuples on the last page (it is likely that there will be tuples
+			 * on other nearby pages as well, but those can be skipped).
+			 *
+			 * Implement this by always treating the last block as unsafe to
+			 * skip.
+			 */
+			if (next_unskippable_block == rel_pages - 1)
+				break;
+
+			/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
+			if (!vacrel->skipwithvm)
 				break;
 
 			/*
-			 * All-visible block is safe to skip in non-aggressive case.  But
-			 * remember that the final range contains such a block for later.
+			 * Aggressive VACUUM caller can't skip pages just because they are
+			 * all-visible.  They may still skip all-frozen pages, which can't
+			 * contain XIDs < OldestXmin (XIDs that aren't already frozen by
+			 * now).
 			 */
-			skipsallvis = true;
+			if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0)
+			{
+				if (vacrel->aggressive)
+					break;
+
+				/*
+				 * All-visible block is safe to skip in non-aggressive case.
+				 * But remember that the final range contains such a block for
+				 * later.
+				 */
+				skipsallvis = true;
+			}
+
+			vacuum_delay_point();
+			next_unskippable_block++;
 		}
+		/* write the local variables back to vacrel */
+		vacrel->next_unskippable_block = next_unskippable_block;
+		vacrel->next_unskippable_allvis = next_unskippable_allvis;
+		vacrel->next_unskippable_vmbuffer = next_unskippable_vmbuffer;
 
-		vacuum_delay_point();
-		next_unskippable_block++;
-		nskippable_blocks++;
+		/*
+		 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then. Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
 	}
 
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (next_block == next_unskippable_block)
+		*all_visible_according_to_vm = next_unskippable_allvis;
 	else
-	{
-		*skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
-
-	return next_unskippable_block;
+		*all_visible_according_to_vm = true;
+	*blkno = vacrel->current_block = next_block;
+	return true;
 }
 
 /*
@@ -1752,8 +1778,8 @@ lazy_scan_prune(LVRelState *vacrel,
 
 	/*
 	 * Handle setting visibility map bit based on information from the VM (as
-	 * of last lazy_scan_skip() call), and from all_visible and all_frozen
-	 * variables
+	 * of last heap_vac_scan_next_block() call), and from all_visible and
+	 * all_frozen variables
 	 */
 	if (!all_visible_according_to_vm && all_visible)
 	{
@@ -1788,8 +1814,8 @@ lazy_scan_prune(LVRelState *vacrel,
 	/*
 	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
 	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after lazy_scan_skip() was called, so we must recheck with
-	 * buffer lock before concluding that the VM is corrupt.
+	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
+	 * with buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
 			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
-- 
2.39.2

Attachment: v8-0003-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch (text/x-patch)
From 941ae7522ab6ac24ca5981303e4e7f6e2cba7458 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v8 3/3] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
 src/backend/access/heap/vacuumlazy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0aa08762015..e1657ef4f9b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1172,7 +1172,6 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 				skipsallvis = true;
 			}
 
-			vacuum_delay_point();
 			next_unskippable_block++;
 		}
 		/* write the local variables back to vacrel */
-- 
2.39.2

#24Melanie Plageman
melanieplageman@gmail.com
In reply to: Peter Geoghegan (#21)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 11:00 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Mar 8, 2024 at 10:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Not that it will be fun to maintain another special case in the VM
update code in lazy_scan_prune(), but we could have a special case
that checks if DISABLE_PAGE_SKIPPING was passed to vacuum and if
all_visible_according_to_vm is true and all_visible is true, we update
the VM but don't dirty the page.

It wouldn't necessarily have to be a special case, I think.

We already conditionally set PD_ALL_VISIBLE/call PageIsAllVisible() in
the block where lazy_scan_prune marks a previously all-visible page
all-frozen -- we don't want to dirty the page unnecessarily there.
Making it conditional is defensive in that particular block (this was
also added by this same commit of mine), and avoids dirtying the page.

Ah, I see. I got confused. Even if the VM is suspect, if the page is
all visible and the heap block is already set all-visible in the VM,
there is no need to update it.

This did make me realize that it seems like there is a case we don't
handle in master with the current code that would be fixed by changing
that code Heikki mentioned:

Right now, even if the heap block is incorrectly marked all-visible in
the VM, if DISABLE_PAGE_SKIPPING is passed to vacuum,
all_visible_according_to_vm will be passed to lazy_scan_prune() as
false. Then even if lazy_scan_prune() finds that the page is not
all-visible, we won't call visibilitymap_clear().

If we revert the code setting next_unskippable_allvis to false in
lazy_scan_skip() when vacrel->skipwithvm is false and allow
all_visible_according_to_vm to be true when the VM has it incorrectly
set to true, then once lazy_scan_prune() discovers the page is not
all-visible and assuming PD_ALL_VISIBLE is not marked so
PageIsAllVisible() returns false, we will call visibilitymap_clear()
to clear the incorrectly set VM bit (without dirtying the page).

Here is a table of the variable states at the end of lazy_scan_prune()
for clarity:

master:
all_visible_according_to_vm: false
all_visible: false
VM says all vis: true
PageIsAllVisible: false

if fixed:
all_visible_according_to_vm: true
all_visible: false
VM says all vis: true
PageIsAllVisible: false
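The decision the table above describes can be sketched as a tiny model of the VM-clearing branch at the end of lazy_scan_prune() (helper name is illustrative, not the real function): the bit is cleared only when the caller believed the page was all-visible, the page itself is not, and the VM still says it is.

```python
# Minimal model of the VM-clearing condition at the end of
# lazy_scan_prune(): all_visible_according_to_vm && !PageIsAllVisible(page)
# && visibilitymap_get_status(...) != 0. The function name is illustrative.
def should_clear_vm_bit(all_visible_according_to_vm: bool,
                        page_is_all_visible: bool,
                        vm_says_all_visible: bool) -> bool:
    return (all_visible_according_to_vm
            and not page_is_all_visible
            and vm_says_all_visible)

# master with DISABLE_PAGE_SKIPPING: the flag is forced false, so the
# incorrectly set VM bit survives
assert should_clear_vm_bit(False, False, True) is False

# if fixed: the flag reflects the VM, so the stale bit gets cleared
assert should_clear_vm_bit(True, False, True) is True
```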

Seems like it might be possible to simplify/consolidate the VM-setting
code that's now located at the end of lazy_scan_prune. Perhaps the two
distinct blocks that call visibilitymap_set() could be combined into
one.

I agree. I have some code to do that in an unproposed patch which
combines the VM updates into the prune record. We will definitely want
to reorganize the code when we do that record combining.

- Melanie

#25Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#24)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 11:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Mar 8, 2024 at 11:00 AM Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Mar 8, 2024 at 10:48 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Not that it will be fun to maintain another special case in the VM
update code in lazy_scan_prune(), but we could have a special case
that checks if DISABLE_PAGE_SKIPPING was passed to vacuum and if
all_visible_according_to_vm is true and all_visible is true, we update
the VM but don't dirty the page.

It wouldn't necessarily have to be a special case, I think.

We already conditionally set PD_ALL_VISIBLE/call PageIsAllVisible() in
the block where lazy_scan_prune marks a previously all-visible page
all-frozen -- we don't want to dirty the page unnecessarily there.
Making it conditional is defensive in that particular block (this was
also added by this same commit of mine), and avoids dirtying the page.

Ah, I see. I got confused. Even if the VM is suspect, if the page is
all visible and the heap block is already set all-visible in the VM,
there is no need to update it.

This did make me realize that it seems like there is a case we don't
handle in master with the current code that would be fixed by changing
that code Heikki mentioned:

Right now, even if the heap block is incorrectly marked all-visible in
the VM, if DISABLE_PAGE_SKIPPING is passed to vacuum,
all_visible_according_to_vm will be passed to lazy_scan_prune() as
false. Then even if lazy_scan_prune() finds that the page is not
all-visible, we won't call visibilitymap_clear().

If we revert the code setting next_unskippable_allvis to false in
lazy_scan_skip() when vacrel->skipwithvm is false and allow
all_visible_according_to_vm to be true when the VM has it incorrectly
set to true, then once lazy_scan_prune() discovers the page is not
all-visible and assuming PD_ALL_VISIBLE is not marked so
PageIsAllVisible() returns false, we will call visibilitymap_clear()
to clear the incorrectly set VM bit (without dirtying the page).

Here is a table of the variable states at the end of lazy_scan_prune()
for clarity:

master:
all_visible_according_to_vm: false
all_visible: false
VM says all vis: true
PageIsAllVisible: false

if fixed:
all_visible_according_to_vm: true
all_visible: false
VM says all vis: true
PageIsAllVisible: false

Okay, I now see from Heikki's v8-0001 that he was already aware of this.

- Melanie

#26Melanie Plageman
melanieplageman@gmail.com
In reply to: Heikki Linnakangas (#23)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 08, 2024 at 06:07:33PM +0200, Heikki Linnakangas wrote:

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

I will say that now all of the variable names are *very* long. I didn't
want to remove the "state" from LVRelState->next_block_state. (In fact, I
kind of miss the "get". But I had to draw the line somewhere.) I think
without "state" in the name, next_block sounds too much like a function.

Any ideas for shortening the names of next_block_state and its members
or are you fine with them?

Hmm, we can remove the inner struct and add the fields directly into
LVRelState. LVRelState already contains many groups of variables, like
"Error reporting state", with no inner structs. I did it that way in the
attached patch. I also used local variables more.

+1; I like the result of this.

However, by adding a vmbuffer to next_block_state, the callback may be
able to avoid extra VM fetches from one invocation to the next.

That's a good idea, holding separate VM buffer pins for the next-unskippable
block and the block we're processing. I adopted that approach.

Cool. It can't be avoided with streaming read vacuum, but I wonder if
there would ever be adverse effects to doing it on master? Maybe if we
are doing a lot of skipping and the block of the VM for the heap blocks
we are processing ends up changing each time but we would have had the
right block of the VM if we used the one from
heap_vac_scan_next_block()?

Frankly, I'm in favor of just doing it now because it makes
lazy_scan_heap() less confusing.

My compiler caught one small bug when I was playing with various
refactorings of this: heap_vac_scan_next_block() must set *blkno to
rel_pages, not InvalidBlockNumber, after the last block. The caller uses the
'blkno' variable also after the loop, and assumes that it's set to
rel_pages.

Oops! Thanks for catching that.

I'm pretty happy with the attached patches now. The first one fixes the
existing bug I mentioned in the other email (based on the on-going
discussion that might not how we want to fix it though).

ISTM we should still do the fix you mentioned -- seems like it has more
upsides than downsides?

From b68cb29c547de3c4acd10f31aad47b453d154666 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 16:00:22 +0200
Subject: [PATCH v8 1/3] Set all_visible_according_to_vm correctly with
DISABLE_PAGE_SKIPPING

It's important for 'all_visible_according_to_vm' to correctly reflect
whether the VM bit is set or not, even when we are not trusting the VM
to skip pages, because contrary to what the comment said,
lazy_scan_prune() relies on it.

If it's incorrectly set to 'false', when the VM bit is in fact set,
lazy_scan_prune() will try to set the VM bit again and dirty the page
unnecessarily. As a result, if you used DISABLE_PAGE_SKIPPING, all
heap pages were dirtied, even if there were no changes. We would also
fail to clear any VM bits that were set incorrectly.

This was broken in commit 980ae17310, so backpatch to v16.

LGTM.

From 47af1ca65cf55ca876869b43bff47f9d43f0750e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 17:32:19 +0200
Subject: [PATCH v8 2/3] Confine vacuum skip logic to lazy_scan_skip()
---
src/backend/access/heap/vacuumlazy.c | 256 +++++++++++++++------------
1 file changed, 141 insertions(+), 115 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae5..0aa08762015 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,12 @@ typedef struct LVRelState
int64		live_tuples;	/* # live tuples remaining */
int64		recently_dead_tuples;	/* # dead, but not yet removable */
int64		missed_dead_tuples; /* # removable, but not removed */

Perhaps we should add a comment to the blkno member of LVRelState
indicating that it is used for error reporting and logging?

+	/* State maintained by heap_vac_scan_next_block() */
+	BlockNumber current_block;	/* last block returned */
+	BlockNumber next_unskippable_block; /* next unskippable block */
+	bool		next_unskippable_allvis;	/* its visibility status */
+	Buffer		next_unskippable_vmbuffer;	/* buffer containing its VM bit */
} LVRelState;
/*
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+						 bool *all_visible_according_to_vm)
{
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
+	BlockNumber next_block;
bool		skipsallvis = false;
+	BlockNumber rel_pages = vacrel->rel_pages;
+	BlockNumber next_unskippable_block;
+	bool		next_unskippable_allvis;
+	Buffer		next_unskippable_vmbuffer;
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
-	{
-		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
-													   next_unskippable_block,
-													   vmbuffer);
+	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
+	next_block = vacrel->current_block + 1;
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+	/* Have we reached the end of the relation? */
+	if (next_block >= rel_pages)
+	{
+		if (BufferIsValid(vacrel->next_unskippable_vmbuffer))
{
-			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
-			break;
+			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
+			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
}

Good catch here. Also, I noticed that I set current_block to
InvalidBlockNumber too which seems strictly worse than leaving it as
rel_pages + 1 -- just in case a future dev makes a change that
accidentally causes heap_vac_scan_next_block() to be called again and
adding InvalidBlockNumber + 1 would end up going back to 0. So this all
looks correct to me.

+		*blkno = rel_pages;
+		return false;
+	}
+	next_unskippable_block = vacrel->next_unskippable_block;
+	next_unskippable_allvis = vacrel->next_unskippable_allvis;

Wish there was a newline here.

I see why you removed my treatise-level comment that was here about
unskipped skippable blocks. However, when I was trying to understand
this code, I did wish there was some comment that explained to me why we
needed all of the variables next_unskippable_block,
next_unskippable_allvis, all_visible_according_to_vm, and current_block.

The idea that we would choose not to skip a skippable block because of
kernel readahead makes sense. The part that I had trouble wrapping my
head around was that we want to also keep the visibility status of both
the beginning and ending blocks of the skippable range and then use
those to infer the visibility status of the intervening blocks without
another VM lookup if we decide not to skip them.
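One way to see why no extra VM lookup is needed for the unskipped blocks in between: every block strictly before next_unskippable_block was, by construction, found all-visible when the range was set up, so only the unskippable block itself needs the status read from the VM. A small sketch of that inference (illustrative names, mirroring the logic at the end of heap_vac_scan_next_block()):

```python
def infer_all_visible(next_block: int,
                      next_unskippable_block: int,
                      next_unskippable_allvis: bool) -> bool:
    """Visibility status returned for next_block without re-reading the VM.

    Blocks before next_unskippable_block were all-visible when the range
    was built; the unskippable block carries the status read from the VM.
    """
    if next_block == next_unskippable_block:
        return next_unskippable_allvis
    return True

# a block inside an unskipped skippable range is known all-visible
assert infer_all_visible(10, 15, False) is True
# the unskippable block itself reports whatever the VM said
assert infer_all_visible(15, 15, False) is False
```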

+	if (next_unskippable_block == InvalidBlockNumber ||
+		next_block > next_unskippable_block)
+	{
/*
-		 * Caller must scan the last page to determine whether it has tuples
-		 * (caller must have the opportunity to set vacrel->nonempty_pages).
-		 * This rule avoids having lazy_truncate_heap() take access-exclusive
-		 * lock on rel to attempt a truncation that fails anyway, just because
-		 * there are tuples on the last page (it is likely that there will be
-		 * tuples on other nearby pages as well, but those can be skipped).
-		 *
-		 * Implement this by always treating the last block as unsafe to skip.
+		 * Find the next unskippable block using the visibility map.
*/
-		if (next_unskippable_block == rel_pages - 1)
-			break;
+		next_unskippable_block = next_block;
+		next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer;
+		for (;;)

Ah yes, my old loop condition was redundant with the break if
next_unskippable_block == rel_pages - 1. This is better.

+		{
+			uint8		mapbits = visibilitymap_get_status(vacrel->rel,
+														   next_unskippable_block,
+														   &next_unskippable_vmbuffer);

- /* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
- if (!vacrel->skipwithvm)

...

+			}
+
+			vacuum_delay_point();
+			next_unskippable_block++;
}

Would love a newline here

+		/* write the local variables back to vacrel */
+		vacrel->next_unskippable_block = next_unskippable_block;
+		vacrel->next_unskippable_allvis = next_unskippable_allvis;
+		vacrel->next_unskippable_vmbuffer = next_unskippable_vmbuffer;

...

-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
+	if (next_block == next_unskippable_block)
+		*all_visible_according_to_vm = next_unskippable_allvis;
else
-	{
-		*skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
-	}
-
-	return next_unskippable_block;
+		*all_visible_according_to_vm = true;

Also a newline here

+	*blkno = vacrel->current_block = next_block;
+	return true;
}

From 941ae7522ab6ac24ca5981303e4e7f6e2cba7458 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Sun, 31 Dec 2023 12:49:56 -0500
Subject: [PATCH v8 3/3] Remove unneeded vacuum_delay_point from
heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there is
no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.
---
src/backend/access/heap/vacuumlazy.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 0aa08762015..e1657ef4f9b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1172,7 +1172,6 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
skipsallvis = true;
}

- vacuum_delay_point();
next_unskippable_block++;
}
/* write the local variables back to vacrel */
--
2.39.2

LGTM

- Melanie

#27Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#26)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Mar 8, 2024 at 12:34 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Fri, Mar 08, 2024 at 06:07:33PM +0200, Heikki Linnakangas wrote:

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

I will say that now all of the variable names are *very* long. I didn't
want to remove the "state" from LVRelState->next_block_state. (In fact, I
kind of miss the "get". But I had to draw the line somewhere.) I think
without "state" in the name, next_block sounds too much like a function.

Any ideas for shortening the names of next_block_state and its members
or are you fine with them?

Hmm, we can remove the inner struct and add the fields directly into
LVRelState. LVRelState already contains many groups of variables, like
"Error reporting state", with no inner structs. I did it that way in the
attached patch. I also used local variables more.

+1; I like the result of this.

I did some perf testing of 0002 and 0003 using that fully-in-SB vacuum
test I mentioned in an earlier email. 0002 is a vacuum time reduction
from an average of 11.5 ms on master to 9.6 ms with 0002 applied. And
0003 reduces the time vacuum takes from 11.5 ms on master to 7.4 ms
with 0003 applied.

I profiled them and 0002 seems to simply spend less time in
heap_vac_scan_next_block() than master did in lazy_scan_skip().

And 0003 reduces the time vacuum takes because vacuum_delay_point()
shows up pretty high in the profile.

Here are the profiles for my test.

profile of master:

+   29.79%  postgres  postgres           [.] visibilitymap_get_status
+   27.35%  postgres  postgres           [.] vacuum_delay_point
+   17.00%  postgres  postgres           [.] lazy_scan_skip
+    6.59%  postgres  postgres           [.] heap_vacuum_rel
+    6.43%  postgres  postgres           [.] BufferGetBlockNumber

profile with 0001-0002:

+   40.30%  postgres  postgres           [.] visibilitymap_get_status
+   20.32%  postgres  postgres           [.] vacuum_delay_point
+   20.26%  postgres  postgres           [.] heap_vacuum_rel
+    5.17%  postgres  postgres           [.] BufferGetBlockNumber

profile with 0001-0003

+   59.77%  postgres  postgres           [.] visibilitymap_get_status
+   23.86%  postgres  postgres           [.] heap_vacuum_rel
+    6.59%  postgres  postgres           [.] StrategyGetBuffer

Test DDL and setup:

psql -c "ALTER SYSTEM SET shared_buffers = '16 GB';"
psql -c "CREATE TABLE foo(id INT, a INT, b INT, c INT, d INT, e INT, f
INT, g INT) with (autovacuum_enabled=false, fillfactor=25);"
psql -c "INSERT INTO foo SELECT i, i, i, i, i, i, i, i FROM
generate_series(1, 46000000)i;"
psql -c "VACUUM (FREEZE) foo;"
pg_ctl restart
psql -c "SELECT pg_prewarm('foo');"
# make sure there isn't an ill-timed checkpoint
psql -c "\timing on" -c "vacuum (verbose) foo;"

- Melanie

#28Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#14)
Re: Confine vacuum skip logic to lazy_scan_skip

On Wed, Mar 6, 2024 at 6:47 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Performance results:

The TL;DR of my performance results is that streaming read vacuum is
faster. However there is an issue with the interaction of the streaming
read code and the vacuum buffer access strategy which must be addressed.

I have investigated the interaction between
maintenance_io_concurrency, streaming reads, and the vacuum buffer
access strategy (BAS_VACUUM).

The streaming read API limits max_pinned_buffers to a pinned buffer
multiplier (currently 4) * maintenance_io_concurrency buffers with the
goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.

Since the BAS_VACUUM ring buffer is size 256 kB or 32 buffers with
default block size, that means that for a fully uncached vacuum in
which all blocks must be vacuumed and will be dirtied, you'd have to
set maintenance_io_concurrency at 8 or lower to see the same number of
reuses (and shared buffer consumption) as master.

Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
seems like we should force max_pinned_buffers to a value that
guarantees the expected shared buffer usage by vacuum. But that means
that maintenance_io_concurrency does not have a predictable impact on
streaming read vacuum.

What is the right thing to do here?

At the least, the default size of the BAS_VACUUM ring buffer should be
BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
(probably rounded up to the next power of two) bytes.
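Spelling out the arithmetic above (a sketch assuming the default 8 kB BLCKSZ and the default maintenance_io_concurrency of 10):

```python
BLCKSZ = 8192                # default block size, bytes
RING_BYTES = 256 * 1024      # default BAS_VACUUM ring buffer size
PIN_MULTIPLIER = 4           # streaming read pinned-buffer multiplier

ring_buffers = RING_BYTES // BLCKSZ
assert ring_buffers == 32

# largest maintenance_io_concurrency whose pin limit still fits the ring,
# matching the "8 or lower" observation above
fits = [mio for mio in range(1, 65) if PIN_MULTIPLIER * mio <= ring_buffers]
assert max(fits) == 8

# proposed default ring size: BLCKSZ * multiplier * default
# maintenance_io_concurrency, rounded up to the next power of two
def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

proposed = next_pow2(BLCKSZ * PIN_MULTIPLIER * 10)
assert proposed == 512 * 1024    # 320 kB rounds up to 512 kB
```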

- Melanie

#29Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#28)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Wed, Mar 6, 2024 at 6:47 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Performance results:

The TL;DR of my performance results is that streaming read vacuum is
faster. However there is an issue with the interaction of the streaming
read code and the vacuum buffer access strategy which must be addressed.

Woo.

I have investigated the interaction between
maintenance_io_concurrency, streaming reads, and the vacuum buffer
access strategy (BAS_VACUUM).

The streaming read API limits max_pinned_buffers to a pinned buffer
multiplier (currently 4) * maintenance_io_concurrency buffers with the
goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.

Since the BAS_VACUUM ring buffer is 256 kB, or 32 buffers at the
default block size, that means that for a fully uncached vacuum in
which all blocks must be vacuumed and will be dirtied, you'd have to
set maintenance_io_concurrency to 8 or lower to see the same number of
reuses (and shared buffer consumption) as master.

Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
seems like we should force max_pinned_buffers to a value that
guarantees the expected shared buffer usage by vacuum. But that means
that maintenance_io_concurrency does not have a predictable impact on
streaming read vacuum.

What is the right thing to do here?

At the least, the default size of the BAS_VACUUM ring buffer should be
BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
(probably rounded up to the next power of two) bytes.

Hmm, does the v6 look-ahead distance control algorithm mitigate that
problem? Using the ABC classifications from the streaming read
thread, I think for A it should now pin only 1, for B 16 and for C, it
depends on the size of the random 'chunks': if you have a lot of size
1 random reads then it shouldn't go above 10 because of (default)
maintenance_io_concurrency. The only way to get up to very high
numbers would be to have a lot of random chunks triggering behaviour
C, but each made up of long runs of misses. For example one can
contrive a BHS query that happens to read pages 0-15 then 20-35 then
40-55 etc etc so that we want to get lots of wide I/Os running
concurrently. Unless vacuum manages to do something like that, it
shouldn't be able to exceed 32 buffers very easily.

I suspect that if we taught streaming_read.c to ask the
BufferAccessStrategy (if one is passed in) what its recommended pin
limit is (strategy->nbuffers?), we could just clamp
max_pinned_buffers, and it would be hard to find a workload where that
makes a difference, and we could think about more complicated logic
later.

In other words, I think/hope the fix for your complaints about
excessive pinning from v5 WRT all-cached heap scans might have also
already improved this case by happy coincidence? I haven't tried it
out though, I just
read your description of the problem...

#30Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#26)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 08/03/2024 19:34, Melanie Plageman wrote:

On Fri, Mar 08, 2024 at 06:07:33PM +0200, Heikki Linnakangas wrote:

On 08/03/2024 02:46, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 10:00:23PM -0500, Melanie Plageman wrote:

On Wed, Mar 06, 2024 at 09:55:21PM +0200, Heikki Linnakangas wrote:

However, by adding a vmbuffer to next_block_state, the callback may be
able to avoid extra VM fetches from one invocation to the next.

That's a good idea, holding separate VM buffer pins for the next-unskippable
block and the block we're processing. I adopted that approach.

Cool. It can't be avoided with streaming read vacuum, but I wonder if
there would ever be adverse effects to doing it on master? Maybe if we
are doing a lot of skipping, and the block of the VM for the heap blocks
we are processing ends up changing each time, when we would have had the
right block of the VM if we used the one from
heap_vac_scan_next_block()?

Frankly, I'm in favor of just doing it now because it makes
lazy_scan_heap() less confusing.

+1

I'm pretty happy with the attached patches now. The first one fixes the
existing bug I mentioned in the other email (based on the ongoing
discussion, that might not be how we want to fix it, though).

ISTM we should still do the fix you mentioned -- seems like it has more
upsides than downsides?

From b68cb29c547de3c4acd10f31aad47b453d154666 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 16:00:22 +0200
Subject: [PATCH v8 1/3] Set all_visible_according_to_vm correctly with
DISABLE_PAGE_SKIPPING

It's important for 'all_visible_according_to_vm' to correctly reflect
whether the VM bit is set or not, even when we are not trusting the VM
to skip pages, because contrary to what the comment said,
lazy_scan_prune() relies on it.

If it's incorrectly set to 'false', when the VM bit is in fact set,
lazy_scan_prune() will try to set the VM bit again and dirty the page
unnecessarily. As a result, if you used DISABLE_PAGE_SKIPPING, all
heap pages were dirtied, even if there were no changes. We would also
fail to clear any VM bits that were set incorrectly.

This was broken in commit 980ae17310, so backpatch to v16.

LGTM.

Committed and backpatched this.

From 47af1ca65cf55ca876869b43bff47f9d43f0750e Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 17:32:19 +0200
Subject: [PATCH v8 2/3] Confine vacuum skip logic to lazy_scan_skip()
---
src/backend/access/heap/vacuumlazy.c | 256 +++++++++++++++------------
1 file changed, 141 insertions(+), 115 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae5..0aa08762015 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,12 @@ typedef struct LVRelState
int64		live_tuples;	/* # live tuples remaining */
int64		recently_dead_tuples;	/* # dead, but not yet removable */
int64		missed_dead_tuples; /* # removable, but not removed */

Perhaps we should add a comment to the blkno member of LVRelState
indicating that it is used for error reporting and logging?

Well, it's already under the "/* Error reporting state */" section. I
agree this is a little confusing, the name 'blkno' doesn't convey that
it's supposed to be used just for error reporting. But it's a
pre-existing issue so I left it alone. It can be changed with a separate
patch if we come up with a good idea.

I see why you removed my treatise-level comment that was here about
unskipped skippable blocks. However, when I was trying to understand
this code, I did wish there was some comment that explained to me why we
needed all of the variables next_unskippable_block,
next_unskippable_allvis, all_visible_according_to_vm, and current_block.

The idea that we would choose not to skip a skippable block because of
kernel readahead makes sense. The part that I had trouble wrapping my
head around was that we want to also keep the visibility status of both
the beginning and ending blocks of the skippable range and then use
those to infer the visibility status of the intervening blocks without
another VM lookup if we decide not to skip them.

Right, I removed the comment because it looked a little out of place and
it duplicated the other comments sprinkled in the function. I agree this
could still use some more comments though.

Here's yet another attempt at making this more readable. I moved the
logic to find the next unskippable block to a separate function, and
added comments to make the states more explicit. What do you think?

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v9-0001-Confine-vacuum-skip-logic-to-lazy_scan_skip.patch (text/x-patch)
From c21480e9da61e145573de3b502551dde1b8fa3f6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 17:32:19 +0200
Subject: [PATCH v9 1/2] Confine vacuum skip logic to lazy_scan_skip()

Rename lazy_scan_skip() to heap_vac_scan_next_block() and move more
code into the function, so that the caller doesn't need to know about
ranges or skipping anymore. heap_vac_scan_next_block() returns the
next block to process, and the logic for determining that block is all
within the function. This makes the skipping logic easier to
understand, as it's all in the same function, and makes the calling
code easier to understand as it's less cluttered. The state variables
needed to manage the skipping logic are moved to LVRelState.

heap_vac_scan_next_block() now manages its own VM buffer separately
from the caller's vmbuffer variable. The caller's vmbuffer holds the
VM page for the current block it's processing, while
heap_vac_scan_next_block() keeps a pin on the VM page for the next
unskippable block. Most of the time they are the same, so we hold two
pins on the same buffer, but it's more convenient to manage them
separately.

For readability inside heap_vac_scan_next_block(), move the logic of
finding the next unskippable block to a separate function, and add some
comments.

This refactoring will also help future patches to switch to using a
streaming read interface, and eventually AIO
(https://postgr.es/m/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com)

Author: Melanie Plageman, with some changes by me
Discussion: https://postgr.es/m/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 233 +++++++++++++++++----------
 1 file changed, 146 insertions(+), 87 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae..1757eb49b7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -204,6 +204,12 @@ typedef struct LVRelState
 	int64		live_tuples;	/* # live tuples remaining */
 	int64		recently_dead_tuples;	/* # dead, but not yet removable */
 	int64		missed_dead_tuples; /* # removable, but not removed */
+
+	/* State maintained by heap_vac_scan_next_block() */
+	BlockNumber current_block;	/* last block returned */
+	BlockNumber next_unskippable_block; /* next unskippable block */
+	bool		next_unskippable_allvis;	/* its visibility status */
+	Buffer		next_unskippable_vmbuffer;	/* buffer containing its VM bit */
 } LVRelState;
 
 /* Struct for saving and restoring vacuum error information. */
@@ -217,10 +223,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static BlockNumber lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer,
-								  BlockNumber next_block,
-								  bool *next_unskippable_allvis,
-								  bool *skipping_current_range);
+static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+									 bool *all_visible_according_to_vm);
+static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
 								   bool sharelock, Buffer vmbuffer);
@@ -803,12 +808,11 @@ lazy_scan_heap(LVRelState *vacrel)
 {
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
-				next_unskippable_block,
 				next_fsm_block_to_vacuum = 0;
+	bool		all_visible_according_to_vm;
+
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
-	bool		next_unskippable_allvis,
-				skipping_current_range;
 	const int	initprog_index[] = {
 		PROGRESS_VACUUM_PHASE,
 		PROGRESS_VACUUM_TOTAL_HEAP_BLKS,
@@ -822,44 +826,19 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
-	/* Set up an initial range of skippable blocks using the visibility map */
-	next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer, 0,
-											&next_unskippable_allvis,
-											&skipping_current_range);
-	for (blkno = 0; blkno < rel_pages; blkno++)
+	/* Initialize for the first heap_vac_scan_next_block() call */
+	vacrel->current_block = InvalidBlockNumber;
+	vacrel->next_unskippable_block = InvalidBlockNumber;
+	vacrel->next_unskippable_allvis = false;
+	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
+
+	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
 	{
 		Buffer		buf;
 		Page		page;
-		bool		all_visible_according_to_vm;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		if (blkno == next_unskippable_block)
-		{
-			/*
-			 * Can't skip this page safely.  Must scan the page.  But
-			 * determine the next skippable range after the page first.
-			 */
-			all_visible_according_to_vm = next_unskippable_allvis;
-			next_unskippable_block = lazy_scan_skip(vacrel, &vmbuffer,
-													blkno + 1,
-													&next_unskippable_allvis,
-													&skipping_current_range);
-
-			Assert(next_unskippable_block >= blkno + 1);
-		}
-		else
-		{
-			/* Last page always scanned (may need to set nonempty_pages) */
-			Assert(blkno < rel_pages - 1);
-
-			if (skipping_current_range)
-				continue;
-
-			/* Current range is too small to skip -- just scan the page */
-			all_visible_according_to_vm = true;
-		}
-
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1077,18 +1056,22 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum.  The function uses the visibility map, vacuum options,
+ * and various thresholds to skip blocks which do not need to be processed and
+ * sets blkno to the next block that actually needs to be processed.
  *
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * The block number and visibility status of the next block to process are set
+ * in *blkno and *all_visible_according_to_vm.  The return value is false if
+ * there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about
+ * the relation are read, and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all-visible blocks.  It
+ * also holds information about the next unskippable block, as bookkeeping for
+ * this function.
  *
  * Note: our opinion of which blocks can be skipped can go stale immediately.
  * It's okay if caller "misses" a page whose all-visible or all-frozen marking
@@ -1098,26 +1081,119 @@ lazy_scan_heap(LVRelState *vacrel)
  * older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
  * choice to skip such a range is actually made, making everything safe.)
  */
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+						 bool *all_visible_according_to_vm)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
-	bool		skipsallvis = false;
+	BlockNumber next_block;
 
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
+	next_block = vacrel->current_block + 1;
+
+	/* Have we reached the end of the relation? */
+	if (next_block >= vacrel->rel_pages)
+	{
+		if (BufferIsValid(vacrel->next_unskippable_vmbuffer))
+		{
+			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
+			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
+		}
+		*blkno = vacrel->rel_pages;
+		return false;
+	}
+
+	/*
+	 * We must be in one of the three following states:
+	 */
+	if (vacrel->next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->next_unskippable_block)
+	{
+		/*
+		 * 1. We have just processed an unskippable block (or we're at the
+		 * beginning of the scan).  Find the next unskippable block using the
+		 * visibility map.
+		 */
+		bool		skipsallvis;
+
+		find_next_unskippable_block(vacrel, &skipsallvis);
+
+		/*
+		 * We now know the next block that we must process.  It can be the
+		 * next block after the one we just processed, or something further
+		 * ahead.  If it's further ahead, we can jump to it, but we choose to
+		 * do so only if we can skip at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then.  Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+	}
+
+	/* Now we must be in one of the two remaining states: */
+	if (next_block < vacrel->next_unskippable_block)
+	{
+		/*
+		 * 2. We are processing a range of blocks that we could have skipped
+		 * but chose not to.  We know that they are all-visible in the VM,
+		 * otherwise they would've been unskippable.
+		 */
+		*blkno = vacrel->current_block = next_block;
+		*all_visible_according_to_vm = true;
+		return true;
+	}
+	else
+	{
+		/*
+		 * 3. We reached the next unskippable block.  Process it.  On next
+		 * iteration, we will be back in state 1.
+		 */
+		Assert(next_block == vacrel->next_unskippable_block);
+
+		*blkno = vacrel->current_block = next_block;
+		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
+		return true;
+	}
+}
+
+/*
+ * Find the next unskippable block in a vacuum scan using the visibility map.
+ */
+static void
+find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
+{
+	BlockNumber rel_pages = vacrel->rel_pages;
+	BlockNumber next_unskippable_block = vacrel->next_unskippable_block + 1;
+	Buffer		next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer;
+	bool		next_unskippable_allvis;
+
+	*skipsallvis = false;
+
+	for (;;)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 													   next_unskippable_block,
-													   vmbuffer);
+													   &next_unskippable_vmbuffer);
 
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		next_unskippable_allvis = (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0;
+
+		/*
+		 * A block is unskippable if it is not all visible according to the
+		 * visibility map.
+		 */
+		if (!next_unskippable_allvis)
 		{
 			Assert((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0);
-			*next_unskippable_allvis = false;
 			break;
 		}
 
@@ -1152,34 +1228,17 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			 * All-visible block is safe to skip in non-aggressive case.  But
 			 * remember that the final range contains such a block for later.
 			 */
-			skipsallvis = true;
+			*skipsallvis = true;
 		}
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
-	}
-
-	/*
-	 * We only skip a range with at least SKIP_PAGES_THRESHOLD consecutive
-	 * pages.  Since we're reading sequentially, the OS should be doing
-	 * readahead for us, so there's no gain in skipping a page now and then.
-	 * Skipping such a range might even discourage sequential detection.
-	 *
-	 * This test also enables more frequent relfrozenxid advancement during
-	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
-	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
-	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
-		*skipping_current_range = false;
-	else
-	{
-		*skipping_current_range = true;
-		if (skipsallvis)
-			vacrel->skippedallvis = true;
 	}
 
-	return next_unskippable_block;
+	/* write the local variables back to vacrel */
+	vacrel->next_unskippable_block = next_unskippable_block;
+	vacrel->next_unskippable_allvis = next_unskippable_allvis;
+	vacrel->next_unskippable_vmbuffer = next_unskippable_vmbuffer;
 }
 
 /*
@@ -1752,8 +1811,8 @@ lazy_scan_prune(LVRelState *vacrel,
 
 	/*
 	 * Handle setting visibility map bit based on information from the VM (as
-	 * of last lazy_scan_skip() call), and from all_visible and all_frozen
-	 * variables
+	 * of last heap_vac_scan_next_block() call), and from all_visible and
+	 * all_frozen variables
 	 */
 	if (!all_visible_according_to_vm && all_visible)
 	{
@@ -1788,8 +1847,8 @@ lazy_scan_prune(LVRelState *vacrel,
 	/*
 	 * As of PostgreSQL 9.2, the visibility map bit should never be set if the
 	 * page-level bit is clear.  However, it's possible that the bit got
-	 * cleared after lazy_scan_skip() was called, so we must recheck with
-	 * buffer lock before concluding that the VM is corrupt.
+	 * cleared after heap_vac_scan_next_block() was called, so we must recheck
+	 * with buffer lock before concluding that the VM is corrupt.
 	 */
 	else if (all_visible_according_to_vm && !PageIsAllVisible(page) &&
 			 visibilitymap_get_status(vacrel->rel, blkno, &vmbuffer) != 0)
-- 
2.39.2

v9-0002-Remove-unneeded-vacuum_delay_point-from-heap_vac_.patch (text/x-patch)
From 07437212d629ab00b571dfcadb41497a6c5b43d5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 11 Mar 2024 10:01:37 +0200
Subject: [PATCH v9 2/2] Remove unneeded vacuum_delay_point from
 heap_vac_scan_get_next_block

heap_vac_scan_get_next_block() does relatively little work, so there
is no need to call vacuum_delay_point(). A future commit will call
heap_vac_scan_get_next_block() from a callback, and we would like to
avoid calling vacuum_delay_point() in that callback.

Author: Melanie Plageman
Discussion: https://postgr.es/m/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1757eb49b7..58ee12fdfb 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1231,7 +1231,6 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
 			*skipsallvis = true;
 		}
 
-		vacuum_delay_point();
 		next_unskippable_block++;
 	}
 
-- 
2.39.2

#31Melanie Plageman
melanieplageman@gmail.com
In reply to: Heikki Linnakangas (#30)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Mar 11, 2024 at 11:29:44AM +0200, Heikki Linnakangas wrote:

I see why you removed my treatise-level comment that was here about
unskipped skippable blocks. However, when I was trying to understand
this code, I did wish there was some comment that explained to me why we
needed all of the variables next_unskippable_block,
next_unskippable_allvis, all_visible_according_to_vm, and current_block.

The idea that we would choose not to skip a skippable block because of
kernel readahead makes sense. The part that I had trouble wrapping my
head around was that we want to also keep the visibility status of both
the beginning and ending blocks of the skippable range and then use
those to infer the visibility status of the intervening blocks without
another VM lookup if we decide not to skip them.

Right, I removed the comment because it looked a little out of place and it
duplicated the other comments sprinkled in the function. I agree this could
still use some more comments though.

Here's yet another attempt at making this more readable. I moved the logic
to find the next unskippable block to a separate function, and added
comments to make the states more explicit. What do you think?

Oh, I like the new structure. Very cool! Just a few remarks:

From c21480e9da61e145573de3b502551dde1b8fa3f6 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 8 Mar 2024 17:32:19 +0200
Subject: [PATCH v9 1/2] Confine vacuum skip logic to lazy_scan_skip()

Rename lazy_scan_skip() to heap_vac_scan_next_block() and move more
code into the function, so that the caller doesn't need to know about
ranges or skipping anymore. heap_vac_scan_next_block() returns the
next block to process, and the logic for determining that block is all
within the function. This makes the skipping logic easier to
understand, as it's all in the same function, and makes the calling
code easier to understand as it's less cluttered. The state variables
needed to manage the skipping logic are moved to LVRelState.

heap_vac_scan_next_block() now manages its own VM buffer separately
from the caller's vmbuffer variable. The caller's vmbuffer holds the
VM page for the current block it's processing, while
heap_vac_scan_next_block() keeps a pin on the VM page for the next
unskippable block. Most of the time they are the same, so we hold two
pins on the same buffer, but it's more convenient to manage them
separately.

For readability inside heap_vac_scan_next_block(), move the logic of
finding the next unskippable block to a separate function, and add some
comments.

This refactoring will also help future patches to switch to using a
streaming read interface, and eventually AIO
(/messages/by-id/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com)

Author: Melanie Plageman, with some changes by me

I'd argue you earned co-authorship by now :)

Discussion: /messages/by-id/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA@mail.gmail.com
---
src/backend/access/heap/vacuumlazy.c | 233 +++++++++++++++++----------
1 file changed, 146 insertions(+), 87 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae..1757eb49b7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
+
/*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_next_block() -- get next block for vacuum to process
*
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum.  The function uses the visibility map, vacuum options,
+ * and various thresholds to skip blocks which do not need to be processed and

I wonder if "need" is too strong a word since this function
(heap_vac_scan_next_block()) specifically can set blkno to a block which
doesn't *need* to be processed but which it chooses to process because
of SKIP_PAGES_THRESHOLD.

+ * sets blkno to the next block that actually needs to be processed.
*
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * The block number and visibility status of the next block to process are set
+ * in *blkno and *all_visible_according_to_vm.  The return value is false if
+ * there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about
+ * the relation are read, and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all-visible blocks.  It

Maybe this should say when we have skipped vacuuming all-visible blocks
which are not all-frozen or just blocks which are not all-frozen.

+ * also holds information about the next unskippable block, as bookkeeping for
+ * this function.
*
* Note: our opinion of which blocks can be skipped can go stale immediately.
* It's okay if caller "misses" a page whose all-visible or all-frozen marking

Wonder if it makes sense to move this note to
find_next_unskippable_block().

@@ -1098,26 +1081,119 @@ lazy_scan_heap(LVRelState *vacrel)
* older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
* choice to skip such a range is actually made, making everything safe.)
*/
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+						 bool *all_visible_according_to_vm)
{
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
-	bool		skipsallvis = false;
+	BlockNumber next_block;
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
+	next_block = vacrel->current_block + 1;
+
+	/* Have we reached the end of the relation? */

No strong opinion on this, but I wonder if being at the end of the
relation counts as a fourth state?

+	if (next_block >= vacrel->rel_pages)
+	{
+		if (BufferIsValid(vacrel->next_unskippable_vmbuffer))
+		{
+			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
+			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
+		}
+		*blkno = vacrel->rel_pages;
+		return false;
+	}
+
+	/*
+	 * We must be in one of the three following states:
+	 */
+	if (vacrel->next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->next_unskippable_block)
+	{
+		/*
+		 * 1. We have just processed an unskippable block (or we're at the
+		 * beginning of the scan).  Find the next unskippable block using the
+		 * visibility map.
+		 */

I would reorder the options in the comment or in the if statement since
they seem to be in the reverse order.

+		bool		skipsallvis;
+
+		find_next_unskippable_block(vacrel, &skipsallvis);
+
+		/*
+		 * We now know the next block that we must process.  It can be the
+		 * next block after the one we just processed, or something further
+		 * ahead.  If it's further ahead, we can jump to it, but we choose to
+		 * do so only if we can skip at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then.  Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+
+/*
+ * Find the next unskippable block in a vacuum scan using the visibility map.

To expand this comment, I might mention it is a helper function for
heap_vac_scan_next_block(). I would also say that the next unskippable
block and its visibility information are recorded in vacrel. And that
skipsallvis is set to true if any of the intervening skipped blocks are
not all-frozen.

+ */
+static void
+find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis)
+{
+	BlockNumber rel_pages = vacrel->rel_pages;
+	BlockNumber next_unskippable_block = vacrel->next_unskippable_block + 1;
+	Buffer		next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer;
+	bool		next_unskippable_allvis;
+
+	*skipsallvis = false;
+
+	for (;;)
{
uint8		mapbits = visibilitymap_get_status(vacrel->rel,
next_unskippable_block,
-													   vmbuffer);
+													   &next_unskippable_vmbuffer);
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) == 0)
+		next_unskippable_allvis = (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0;

Otherwise LGTM

- Melanie

#32Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Melanie Plageman (#31)
Re: Confine vacuum skip logic to lazy_scan_skip

On 11/03/2024 18:15, Melanie Plageman wrote:

On Mon, Mar 11, 2024 at 11:29:44AM +0200, Heikki Linnakangas wrote:

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac55ebd2ae..1757eb49b7 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
+
/*
- *	lazy_scan_skip() -- set up range of skippable blocks using visibility map.
+ *	heap_vac_scan_next_block() -- get next block for vacuum to process
*
- * lazy_scan_heap() calls here every time it needs to set up a new range of
- * blocks to skip via the visibility map.  Caller passes the next block in
- * line.  We return a next_unskippable_block for this range.  When there are
- * no skippable blocks we just return caller's next_block.  The all-visible
- * status of the returned block is set in *next_unskippable_allvis for caller,
- * too.  Block usually won't be all-visible (since it's unskippable), but it
- * can be during aggressive VACUUMs (as well as in certain edge cases).
+ * lazy_scan_heap() calls here every time it needs to get the next block to
+ * prune and vacuum.  The function uses the visibility map, vacuum options,
+ * and various thresholds to skip blocks which do not need to be processed and
+ * sets blkno to the next block that actually needs to be processed.

I wonder if "need" is too strong a word since this function
(heap_vac_scan_next_block()) specifically can set blkno to a block which
doesn't *need* to be processed but which it chooses to process because
of SKIP_PAGES_THRESHOLD.

Ok yeah, there's a lot of "needs" here :-). Fixed.

*
- * Sets *skipping_current_range to indicate if caller should skip this range.
- * Costs and benefits drive our decision.  Very small ranges won't be skipped.
+ * The block number and visibility status of the next block to process are set
+ * in *blkno and *all_visible_according_to_vm.  The return value is false if
+ * there are no further blocks to process.
+ *
+ * vacrel is an in/out parameter here; vacuum options and information about
+ * the relation are read, and vacrel->skippedallvis is set to ensure we don't
+ * advance relfrozenxid when we have skipped vacuuming all-visible blocks.  It

Maybe this should say when we have skipped vacuuming all-visible blocks
which are not all-frozen or just blocks which are not all-frozen.

Ok, rephrased.

+ * also holds information about the next unskippable block, as bookkeeping for
+ * this function.
*
* Note: our opinion of which blocks can be skipped can go stale immediately.
* It's okay if caller "misses" a page whose all-visible or all-frozen marking

Wonder if it makes sense to move this note to
find_next_unskippable_block().

Moved.

@@ -1098,26 +1081,119 @@ lazy_scan_heap(LVRelState *vacrel)
* older XIDs/MXIDs.  The vacrel->skippedallvis flag will be set here when the
* choice to skip such a range is actually made, making everything safe.)
*/
-static BlockNumber
-lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
-			   bool *next_unskippable_allvis, bool *skipping_current_range)
+static bool
+heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
+						 bool *all_visible_according_to_vm)
{
-	BlockNumber rel_pages = vacrel->rel_pages,
-				next_unskippable_block = next_block,
-				nskippable_blocks = 0;
-	bool		skipsallvis = false;
+	BlockNumber next_block;
-	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
+	next_block = vacrel->current_block + 1;
+
+	/* Have we reached the end of the relation? */

No strong opinion on this, but I wonder if being at the end of the
relation counts as a fourth state?

Yeah, perhaps. But I think it makes sense to treat it as a special case.

+	if (next_block >= vacrel->rel_pages)
+	{
+		if (BufferIsValid(vacrel->next_unskippable_vmbuffer))
+		{
+			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
+			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
+		}
+		*blkno = vacrel->rel_pages;
+		return false;
+	}
+
+	/*
+	 * We must be in one of the three following states:
+	 */
+	if (vacrel->next_unskippable_block == InvalidBlockNumber ||
+		next_block > vacrel->next_unskippable_block)
+	{
+		/*
+		 * 1. We have just processed an unskippable block (or we're at the
+		 * beginning of the scan).  Find the next unskippable block using the
+		 * visibility map.
+		 */

I would reorder the options in the comment or in the if statement since
they seem to be in the reverse order.

Reordered them in the statement.

It feels a bit wrong to test next_block > vacrel->next_unskippable_block
before vacrel->next_unskippable_block == InvalidBlockNumber. But it
works, and that order makes more sense in the comment IMHO.

+		bool		skipsallvis;
+
+		find_next_unskippable_block(vacrel, &skipsallvis);
+
+		/*
+		 * We now know the next block that we must process.  It can be the
+		 * next block after the one we just processed, or something further
+		 * ahead.  If it's further ahead, we can jump to it, but we choose to
+		 * do so only if we can skip at least SKIP_PAGES_THRESHOLD consecutive
+		 * pages.  Since we're reading sequentially, the OS should be doing
+		 * readahead for us, so there's no gain in skipping a page now and
+		 * then.  Skipping such a range might even discourage sequential
+		 * detection.
+		 *
+		 * This test also enables more frequent relfrozenxid advancement
+		 * during non-aggressive VACUUMs.  If the range has any all-visible
+		 * pages then skipping makes updating relfrozenxid unsafe, which is a
+		 * real downside.
+		 */
+		if (vacrel->next_unskippable_block - next_block >= SKIP_PAGES_THRESHOLD)
+		{
+			next_block = vacrel->next_unskippable_block;
+			if (skipsallvis)
+				vacrel->skippedallvis = true;
+		}
+
+/*
+ * Find the next unskippable block in a vacuum scan using the visibility map.

To expand this comment, I might mention it is a helper function for
heap_vac_scan_next_block(). I would also say that the next unskippable
block and its visibility information are recorded in vacrel. And that
skipsallvis is set to true if any of the intervening skipped blocks are
not all-frozen.

Added comments.

Otherwise LGTM

Ok, pushed! Thank you, this is much more understandable now!

--
Heikki Linnakangas
Neon (https://neon.tech)

#33Melanie Plageman
melanieplageman@gmail.com
In reply to: Heikki Linnakangas (#32)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Mar 11, 2024 at 2:47 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Otherwise LGTM

Ok, pushed! Thank you, this is much more understandable now!

Cool, thanks!

#34Melanie Plageman
melanieplageman@gmail.com
In reply to: Thomas Munro (#29)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Mar 10, 2024 at 11:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I have investigated the interaction between
maintenance_io_concurrency, streaming reads, and the vacuum buffer
access strategy (BAS_VACUUM).

The streaming read API limits max_pinned_buffers to a pinned buffer
multiplier (currently 4) * maintenance_io_concurrency buffers with the
goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.

Since the BAS_VACUUM ring buffer is size 256 kB or 32 buffers with
default block size, that means that for a fully uncached vacuum in
which all blocks must be vacuumed and will be dirtied, you'd have to
set maintenance_io_concurrency at 8 or lower to see the same number of
reuses (and shared buffer consumption) as master.

Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
seems like we should force max_pinned_buffers to a value that
guarantees the expected shared buffer usage by vacuum. But that means
that maintenance_io_concurrency does not have a predictable impact on
streaming read vacuum.

What is the right thing to do here?

At the least, the default size of the BAS_VACUUM ring buffer should be
BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
(probably rounded up to the next power of two) bytes.

Hmm, does the v6 look-ahead distance control algorithm mitigate that
problem? Using the ABC classifications from the streaming read
thread, I think for A it should now pin only 1, for B 16 and for C, it
depends on the size of the random 'chunks': if you have a lot of size
1 random reads then it shouldn't go above 10 because of (default)
maintenance_io_concurrency. The only way to get up to very high
numbers would be to have a lot of random chunks triggering behaviour
C, but each made up of long runs of misses. For example one can
contrive a BHS query that happens to read pages 0-15 then 20-35 then
40-55 etc etc so that we want to get lots of wide I/Os running
concurrently. Unless vacuum manages to do something like that, it
shouldn't be able to exceed 32 buffers very easily.

I suspect that if we taught streaming_read.c to ask the
BufferAccessStrategy (if one is passed in) what its recommended pin
limit is (strategy->nbuffers?), we could just clamp
max_pinned_buffers, and it would be hard to find a workload where that
makes a difference, and we could think about more complicated logic
later.

In other words, I think/hope your complaints about excessive pinning
from v5 WRT all-cached heap scans might have also already improved
this case by happy coincidence? I haven't tried it out though, I just
read your description of the problem...

I've rebased the attached v10 over top of the changes to
lazy_scan_heap() Heikki just committed and over the v6 streaming read
patch set. I started testing them and see that you are right, we no
longer pin too many buffers. However, the uncached example below is
now slower with streaming read than on master -- it looks to be
because it is doing twice as many WAL writes and syncs. I'm still
investigating why that is.

psql \
-c "create table small (a int) with (autovacuum_enabled=false,
fillfactor=25);" \
-c "insert into small select generate_series(1,200000) % 3;" \
-c "update small set a = 6 where a = 1;"

pg_ctl stop
# drop caches
pg_ctl start

psql -c "\timing on" -c "vacuum (verbose) small"

- Melanie

Attachments:

v10-0001-Streaming-Read-API.patchtext/x-patch; charset=US-ASCII; name=v10-0001-Streaming-Read-API.patchDownload
From 2b181d178f0cd55de45659540ec35536918a4c9d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 11 Mar 2024 15:36:39 -0400
Subject: [PATCH v10 1/3] Streaming Read API

---
 src/backend/storage/Makefile             |   2 +-
 src/backend/storage/aio/Makefile         |  14 +
 src/backend/storage/aio/meson.build      |   5 +
 src/backend/storage/aio/streaming_read.c | 648 +++++++++++++++++++++++
 src/backend/storage/buffer/bufmgr.c      | 641 +++++++++++++++-------
 src/backend/storage/buffer/localbuf.c    |  14 +-
 src/backend/storage/meson.build          |   1 +
 src/include/storage/bufmgr.h             |  45 ++
 src/include/storage/streaming_read.h     |  52 ++
 src/tools/pgindent/typedefs.list         |   3 +
 10 files changed, 1215 insertions(+), 210 deletions(-)
 create mode 100644 src/backend/storage/aio/Makefile
 create mode 100644 src/backend/storage/aio/meson.build
 create mode 100644 src/backend/storage/aio/streaming_read.c
 create mode 100644 src/include/storage/streaming_read.h

diff --git a/src/backend/storage/Makefile b/src/backend/storage/Makefile
index 8376cdfca20..eec03f6f2b4 100644
--- a/src/backend/storage/Makefile
+++ b/src/backend/storage/Makefile
@@ -8,6 +8,6 @@ subdir = src/backend/storage
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS     = buffer file freespace ipc large_object lmgr page smgr sync
+SUBDIRS     = aio buffer file freespace ipc large_object lmgr page smgr sync
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/Makefile b/src/backend/storage/aio/Makefile
new file mode 100644
index 00000000000..bcab44c802f
--- /dev/null
+++ b/src/backend/storage/aio/Makefile
@@ -0,0 +1,14 @@
+#
+# Makefile for storage/aio
+#
+# src/backend/storage/aio/Makefile
+#
+
+subdir = src/backend/storage/aio
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+OBJS = \
+	streaming_read.o
+
+include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/storage/aio/meson.build b/src/backend/storage/aio/meson.build
new file mode 100644
index 00000000000..39aef2a84a2
--- /dev/null
+++ b/src/backend/storage/aio/meson.build
@@ -0,0 +1,5 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+backend_sources += files(
+  'streaming_read.c',
+)
diff --git a/src/backend/storage/aio/streaming_read.c b/src/backend/storage/aio/streaming_read.c
new file mode 100644
index 00000000000..404c95b7d3c
--- /dev/null
+++ b/src/backend/storage/aio/streaming_read.c
@@ -0,0 +1,648 @@
+#include "postgres.h"
+
+#include "catalog/pg_tablespace.h"
+#include "miscadmin.h"
+#include "storage/streaming_read.h"
+#include "utils/rel.h"
+#include "utils/spccache.h"
+
+/*
+ * Element type for PgStreamingRead's circular array of block ranges.
+ */
+typedef struct PgStreamingReadRange
+{
+	bool		need_wait;
+	bool		advice_issued;
+	BlockNumber blocknum;
+	int			nblocks;
+	int			per_buffer_data_index;
+	Buffer		buffers[MAX_BUFFERS_PER_TRANSFER];
+	ReadBuffersOperation operation;
+} PgStreamingReadRange;
+
+/*
+ * Streaming read object.
+ */
+struct PgStreamingRead
+{
+	int			max_ios;
+	int			ios_in_progress;
+	int			max_pinned_buffers;
+	int			pinned_buffers;
+	int			pinned_buffers_trigger;
+	int			next_tail_buffer;
+	int			distance;
+	bool		finished;
+	bool		advice_enabled;
+	void	   *pgsr_private;
+	PgStreamingReadBufferCB callback;
+
+	BufferAccessStrategy strategy;
+	BufferManagerRelation bmr;
+	ForkNumber	forknum;
+
+	/* Sometimes we need to buffer one block for flow control. */
+	BlockNumber unget_blocknum;
+	void	   *unget_per_buffer_data;
+
+	/* Next expected block, for detecting sequential access. */
+	BlockNumber seq_blocknum;
+
+	/* Space for optional per-buffer private data. */
+	size_t		per_buffer_data_size;
+	void	   *per_buffer_data;
+
+	/* Circular buffer of ranges. */
+	int			size;
+	int			head;
+	int			tail;
+	PgStreamingReadRange ranges[FLEXIBLE_ARRAY_MEMBER];
+};
+
+/*
+ * Create a new streaming read object that can be used to perform the
+ * equivalent of a series of ReadBuffer() calls for one fork of one relation.
+ * Internally, it generates larger vectored reads where possible by looking
+ * ahead.
+ */
+PgStreamingRead *
+pg_streaming_read_buffer_alloc(int flags,
+							   void *pgsr_private,
+							   size_t per_buffer_data_size,
+							   BufferAccessStrategy strategy,
+							   BufferManagerRelation bmr,
+							   ForkNumber forknum,
+							   PgStreamingReadBufferCB next_block_cb)
+{
+	PgStreamingRead *pgsr;
+	int			size;
+	int			max_ios;
+	uint32		max_pinned_buffers;
+	Oid			tablespace_id;
+
+	/*
+	 * Make sure our bmr's smgr and persistent are populated.  The caller
+	 * asserts that the storage manager will remain valid.
+	 */
+	if (!bmr.smgr)
+	{
+		bmr.smgr = RelationGetSmgr(bmr.rel);
+		bmr.relpersistence = bmr.rel->rd_rel->relpersistence;
+	}
+
+	/*
+	 * Decide how many assumed I/Os we will allow to run concurrently.  That
+	 * is, advice to the kernel to tell it that we will soon read.  This
+	 * number also affects how far we look ahead for opportunities to start
+	 * more I/Os.
+	 */
+	tablespace_id = bmr.smgr->smgr_rlocator.locator.spcOid;
+	if (!OidIsValid(MyDatabaseId) ||
+		(bmr.rel && IsCatalogRelation(bmr.rel)) ||
+		IsCatalogRelationOid(bmr.smgr->smgr_rlocator.locator.relNumber))
+	{
+		/*
+		 * Avoid circularity while trying to look up tablespace settings or
+		 * before spccache.c is ready.
+		 */
+		max_ios = effective_io_concurrency;
+	}
+	else if (flags & PGSR_FLAG_MAINTENANCE)
+		max_ios = get_tablespace_maintenance_io_concurrency(tablespace_id);
+	else
+		max_ios = get_tablespace_io_concurrency(tablespace_id);
+
+	/*
+	 * The desired level of I/O concurrency controls how far ahead we are
+	 * willing to look ahead.  We also clamp it to at least
+	 * MAX_BUFFERS_PER_TRANSFER so that we can have a chance to build up a full
+	 * sized read, even when max_ios is zero.
+	 */
+	max_pinned_buffers = Max(max_ios * 4, MAX_BUFFERS_PER_TRANSFER);
+
+	/* Don't allow this backend to pin more than its share of buffers. */
+	if (SmgrIsTemp(bmr.smgr))
+		LimitAdditionalLocalPins(&max_pinned_buffers);
+	else
+		LimitAdditionalPins(&max_pinned_buffers);
+	Assert(max_pinned_buffers > 0);
+
+	/*
+	 * pgsr->ranges is a circular buffer.  When it is empty, head == tail.
+	 * When it is full, there is an empty element between head and tail.  Head
+	 * can also be empty (nblocks == 0), therefore we need two extra elements
+	 * for non-occupied ranges, on top of max_pinned_buffers to allow for the
+	 * maximum possible number of occupied ranges of the smallest possible
+	 * size of one.
+	 */
+	size = max_pinned_buffers + 2;
+
+	pgsr = (PgStreamingRead *)
+		palloc0(offsetof(PgStreamingRead, ranges) +
+				sizeof(pgsr->ranges[0]) * size);
+
+	pgsr->max_ios = max_ios;
+	pgsr->per_buffer_data_size = per_buffer_data_size;
+	pgsr->max_pinned_buffers = max_pinned_buffers;
+	pgsr->pgsr_private = pgsr_private;
+	pgsr->strategy = strategy;
+	pgsr->size = size;
+
+	pgsr->callback = next_block_cb;
+	pgsr->bmr = bmr;
+	pgsr->forknum = forknum;
+
+	pgsr->unget_blocknum = InvalidBlockNumber;
+
+#ifdef USE_PREFETCH
+
+	/*
+	 * This system supports prefetching advice.  As long as direct I/O isn't
+	 * enabled, and the caller hasn't promised sequential access, we can use
+	 * it.
+	 */
+	if ((io_direct_flags & IO_DIRECT_DATA) == 0 &&
+		(flags & PGSR_FLAG_SEQUENTIAL) == 0)
+		pgsr->advice_enabled = true;
+#endif
+
+	/*
+	 * Skip the initial ramp-up phase if the caller says we're going to be
+	 * reading the whole relation.  This way we start out doing full-sized
+	 * reads.
+	 */
+	if (flags & PGSR_FLAG_FULL)
+		pgsr->distance = Min(MAX_BUFFERS_PER_TRANSFER, pgsr->max_pinned_buffers);
+	else
+		pgsr->distance = 1;
+
+	/*
+	 * We want to avoid creating ranges that are smaller than they could be
+	 * just because we hit max_pinned_buffers.  We only look ahead when the
+	 * number of pinned buffers falls below this trigger number, or put
+	 * another way, we stop looking ahead when we wouldn't be able to build a
+	 * "full sized" range.
+	 */
+	pgsr->pinned_buffers_trigger =
+		Max(1, (int) max_pinned_buffers - MAX_BUFFERS_PER_TRANSFER);
+
+	/* Space for the callback to store extra data along with each block. */
+	if (per_buffer_data_size)
+		pgsr->per_buffer_data = palloc(per_buffer_data_size * max_pinned_buffers);
+
+	return pgsr;
+}
+
+/*
+ * Find the per-buffer data index for the Nth block of a range.
+ */
+static int
+get_per_buffer_data_index(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	int			result;
+
+	/*
+	 * Find slot in the circular buffer of per-buffer data, without using the
+	 * expensive % operator.
+	 */
+	result = range->per_buffer_data_index + n;
+	while (result >= pgsr->max_pinned_buffers)
+		result -= pgsr->max_pinned_buffers;
+	Assert(result == (range->per_buffer_data_index + n) % pgsr->max_pinned_buffers);
+
+	return result;
+}
+
+/*
+ * Return a pointer to the per-buffer data by index.
+ */
+static void *
+get_per_buffer_data_by_index(PgStreamingRead *pgsr, int per_buffer_data_index)
+{
+	return (char *) pgsr->per_buffer_data +
+		pgsr->per_buffer_data_size * per_buffer_data_index;
+}
+
+/*
+ * Return a pointer to the per-buffer data for the Nth block of a range.
+ */
+static void *
+get_per_buffer_data(PgStreamingRead *pgsr, PgStreamingReadRange *range, int n)
+{
+	return get_per_buffer_data_by_index(pgsr,
+										get_per_buffer_data_index(pgsr,
+																  range,
+																  n));
+}
+
+/*
+ * Start reading the head range, and create a new head range.  The new head
+ * range is returned.  It may not be empty, if StartReadBuffers() couldn't
+ * start the entire range; in that case the returned range contains the
+ * remaining portion of the range.
+ */
+static PgStreamingReadRange *
+pg_streaming_read_start_head_range(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *head_range;
+	PgStreamingReadRange *new_head_range;
+	int			nblocks_pinned;
+	int			flags;
+
+	/* Caller should make sure we never exceed max_ios. */
+	Assert((pgsr->ios_in_progress < pgsr->max_ios) ||
+		   (pgsr->ios_in_progress == 0 && pgsr->max_ios == 0));
+
+	/* Should only call if the head range has some blocks to read. */
+	head_range = &pgsr->ranges[pgsr->head];
+	Assert(head_range->nblocks > 0);
+
+	/*
+	 * If advice hasn't been suppressed, this system supports it, and this
+	 * isn't a strictly sequential pattern, then we'll issue advice.
+	 */
+	if (pgsr->advice_enabled &&
+		pgsr->max_ios > 0 &&
+		head_range->blocknum != pgsr->seq_blocknum)
+		flags = READ_BUFFERS_ISSUE_ADVICE;
+	else
+		flags = 0;
+
+	/* We shouldn't be trying to pin more buffers than we're allowed to. */
+	Assert(pgsr->pinned_buffers + head_range->nblocks <= pgsr->max_pinned_buffers);
+
+	/* Start reading as many blocks as we can from the head range. */
+	nblocks_pinned = head_range->nblocks;
+	head_range->need_wait =
+		StartReadBuffers(pgsr->bmr,
+						 head_range->buffers,
+						 pgsr->forknum,
+						 head_range->blocknum,
+						 &nblocks_pinned,
+						 pgsr->strategy,
+						 flags,
+						 &head_range->operation);
+
+	if (head_range->need_wait)
+	{
+		int		distance;
+
+		if (flags & READ_BUFFERS_ISSUE_ADVICE)
+		{
+			/*
+			 * Since we've issued advice, we count an I/O in progress until we
+			 * call WaitReadBuffers().
+			 */
+			head_range->advice_issued = true;
+			pgsr->ios_in_progress++;
+			Assert(pgsr->ios_in_progress <= pgsr->max_ios);
+
+			/*
+			 * Look-ahead distance ramps up rapidly, so we can search for more
+			 * I/Os to start.
+			 */
+			distance = pgsr->distance * 2;
+			distance = Min(distance, pgsr->max_pinned_buffers);
+			pgsr->distance = distance;
+		}
+		else
+		{
+			/*
+			 * There is no point in increasing look-ahead distance if we've
+			 * already reached the full I/O size, since we're not issuing
+			 * advice.  Extra distance would only pin more buffers for no
+			 * benefit.
+			 */
+			if (pgsr->distance > MAX_BUFFERS_PER_TRANSFER)
+			{
+				/* Look-ahead distance gradually decays. */
+				pgsr->distance--;
+			}
+			else
+			{
+				/*
+				 * Look-ahead distance ramps up rapidly, but not more than the
+				 * full I/O size.
+				 */
+				distance = pgsr->distance * 2;
+				distance = Min(distance, MAX_BUFFERS_PER_TRANSFER);
+				distance = Min(distance, pgsr->max_pinned_buffers);
+				pgsr->distance = distance;
+			}
+		}
+	}
+	else
+	{
+		/* No I/O necessary. Look-ahead distance gradually decays. */
+		if (pgsr->distance > 1)
+			pgsr->distance--;
+	}
+
+	/*
+	 * StartReadBuffers() might have pinned fewer blocks than we asked it to,
+	 * but always at least one.
+	 */
+	Assert(nblocks_pinned <= head_range->nblocks);
+	Assert(nblocks_pinned >= 1);
+	pgsr->pinned_buffers += nblocks_pinned;
+
+	/*
+	 * Remember where the next block would be after that, so we can detect
+	 * sequential access next time.
+	 */
+	pgsr->seq_blocknum = head_range->blocknum + nblocks_pinned;
+
+	/*
+	 * Create a new head range.  There must be space, because we have enough
+	 * elements for every range to hold just one block, up to the pin limit.
+	 */
+	Assert(pgsr->size > pgsr->max_pinned_buffers);
+	Assert((pgsr->head + 1) % pgsr->size != pgsr->tail);
+	if (++pgsr->head == pgsr->size)
+		pgsr->head = 0;
+	new_head_range = &pgsr->ranges[pgsr->head];
+	new_head_range->nblocks = 0;
+	new_head_range->advice_issued = false;
+
+	/*
+	 * If we didn't manage to start the whole read above, we split the range,
+	 * moving the remainder into the new head range.
+	 */
+	if (nblocks_pinned < head_range->nblocks)
+	{
+		int			nblocks_remaining = head_range->nblocks - nblocks_pinned;
+
+		head_range->nblocks = nblocks_pinned;
+
+		new_head_range->blocknum = head_range->blocknum + nblocks_pinned;
+		new_head_range->nblocks = nblocks_remaining;
+	}
+
+	/* The new range has per-buffer data starting after the previous range. */
+	new_head_range->per_buffer_data_index =
+		get_per_buffer_data_index(pgsr, head_range, nblocks_pinned);
+
+	return new_head_range;
+}
+
+/*
+ * Ask the callback which block it would like us to read next, with a small
+ * buffer in front to allow pg_streaming_unget_block() to work.
+ */
+static BlockNumber
+pg_streaming_get_block(PgStreamingRead *pgsr, void *per_buffer_data)
+{
+	BlockNumber result;
+
+	if (unlikely(pgsr->unget_blocknum != InvalidBlockNumber))
+	{
+		/*
+		 * If we had to unget a block, now it is time to return that one
+		 * again.
+		 */
+		result = pgsr->unget_blocknum;
+		pgsr->unget_blocknum = InvalidBlockNumber;
+
+		/*
+		 * The same per_buffer_data element must have been used, and still
+		 * contains whatever data the callback wrote into it.  So we just
+		 * sanity-check that we were called with the value that
+		 * pg_streaming_unget_block() pushed back.
+		 */
+		Assert(per_buffer_data == pgsr->unget_per_buffer_data);
+	}
+	else
+	{
+		/* Use the installed callback directly. */
+		result = pgsr->callback(pgsr, pgsr->pgsr_private, per_buffer_data);
+	}
+
+	return result;
+}
+
+/*
+ * In order to deal with short reads in StartReadBuffers(), we sometimes need
+ * to defer handling of a block until later.  This *must* be called with the
+ * last value returned by pg_streaming_get_block().
+ */
+static void
+pg_streaming_unget_block(PgStreamingRead *pgsr, BlockNumber blocknum, void *per_buffer_data)
+{
+	Assert(pgsr->unget_blocknum == InvalidBlockNumber);
+	pgsr->unget_blocknum = blocknum;
+	pgsr->unget_per_buffer_data = per_buffer_data;
+}
+
+static void
+pg_streaming_read_look_ahead(PgStreamingRead *pgsr)
+{
+	PgStreamingReadRange *range;
+
+	/* If we're finished, don't look ahead. */
+	if (pgsr->finished)
+		return;
+
+	/*
+	 * If we've already started the maximum allowed number of I/Os, don't look
+	 * ahead.  (The special case for max_ios == 0 is handled higher up.)
+	 */
+	if (pgsr->max_ios > 0 && pgsr->ios_in_progress == pgsr->max_ios)
+		return;
+
+	/*
+	 * We'll also wait until the number of pinned buffers falls below our
+	 * trigger level, so that we have the chance to create a full range.
+	 */
+	if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+		return;
+
+	do
+	{
+		BlockNumber blocknum;
+		void	   *per_buffer_data;
+
+		/* Do we have a full-sized range? */
+		range = &pgsr->ranges[pgsr->head];
+		if (range->nblocks == lengthof(range->buffers))
+		{
+			/* Start as much of it as we can. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/* If we're now at the I/O limit, stop here. */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+				return;
+
+			/*
+			 * If we couldn't form a full range, then stop here to avoid
+			 * creating small I/O.
+			 */
+			if (pgsr->pinned_buffers >= pgsr->pinned_buffers_trigger)
+				return;
+
+			/*
+			 * That might have only been partially started, but it always
+			 * processes at least one block, so that'll do for now.
+			 */
+			Assert(range->nblocks < lengthof(range->buffers));
+		}
+
+		/* Find per-buffer data slot for the next block. */
+		per_buffer_data = get_per_buffer_data(pgsr, range, range->nblocks);
+
+		/* Find out which block the callback wants to read next. */
+		blocknum = pg_streaming_get_block(pgsr, per_buffer_data);
+		if (blocknum == InvalidBlockNumber)
+		{
+			/* End of stream. */
+			pgsr->finished = true;
+			break;
+		}
+
+		/*
+		 * Is there a head range that we cannot extend, because the requested
+		 * block is not consecutive?
+		 */
+		if (range->nblocks > 0 &&
+			range->blocknum + range->nblocks != blocknum)
+		{
+			/* Yes.  Start it, so we can begin building a new one. */
+			range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * It's possible that it was only partially started, and we have a
+			 * new range with the remainder.  Keep starting I/Os until we get
+			 * it all out of the way, or we hit the I/O limit.
+			 */
+			while (range->nblocks > 0 && pgsr->ios_in_progress < pgsr->max_ios)
+				range = pg_streaming_read_start_head_range(pgsr);
+
+			/*
+			 * We have to 'unget' the block returned by the callback if we
+			 * don't have enough I/O capacity left to start something.
+			 */
+			if (pgsr->ios_in_progress == pgsr->max_ios)
+			{
+				pg_streaming_unget_block(pgsr, blocknum, per_buffer_data);
+				return;
+			}
+		}
+
+		/* If we have a new, empty range, initialize the start block. */
+		if (range->nblocks == 0)
+		{
+			range->blocknum = blocknum;
+		}
+
+		/* This block extends the range by one. */
+		Assert(range->blocknum + range->nblocks == blocknum);
+		range->nblocks++;
+
+	} while (pgsr->pinned_buffers + range->nblocks < pgsr->distance);
+
+	/* Start as much as we can. */
+	while (range->nblocks > 0)
+	{
+		range = pg_streaming_read_start_head_range(pgsr);
+		if (pgsr->ios_in_progress == pgsr->max_ios)
+			break;
+	}
+}
+
+Buffer
+pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_data)
+{
+	/*
+	 * The setting max_ios == 0 requires special t...
+	 */
+	if (pgsr->max_ios > 0 || pgsr->pinned_buffers == 0)
+		pg_streaming_read_look_ahead(pgsr);
+
+	/* See if we have one buffer to return. */
+	while (pgsr->tail != pgsr->head)
+	{
+		PgStreamingReadRange *tail_range;
+
+		tail_range = &pgsr->ranges[pgsr->tail];
+
+		/*
+		 * Do we need to perform an I/O before returning the buffers from this
+		 * range?
+		 */
+		if (tail_range->need_wait)
+		{
+			WaitReadBuffers(&tail_range->operation);
+			tail_range->need_wait = false;
+
+			/*
+			 * We don't really know if the kernel generated a physical I/O
+			 * when we issued advice, let alone when it finished, but it has
+			 * certainly finished now because we've performed the read.
+			 */
+			if (tail_range->advice_issued)
+			{
+				Assert(pgsr->ios_in_progress > 0);
+				pgsr->ios_in_progress--;
+			}
+		}
+
+		/* Are there more buffers available in this range? */
+		if (pgsr->next_tail_buffer < tail_range->nblocks)
+		{
+			int			buffer_index;
+			Buffer		buffer;
+
+			buffer_index = pgsr->next_tail_buffer++;
+			buffer = tail_range->buffers[buffer_index];
+
+			Assert(BufferIsValid(buffer));
+
+			/* We are giving away ownership of this pinned buffer. */
+			Assert(pgsr->pinned_buffers > 0);
+			pgsr->pinned_buffers--;
+
+			if (per_buffer_data)
+				*per_buffer_data = get_per_buffer_data(pgsr, tail_range, buffer_index);
+
+			return buffer;
+		}
+
+		/* Advance tail to next range, if there is one. */
+		if (++pgsr->tail == pgsr->size)
+			pgsr->tail = 0;
+		pgsr->next_tail_buffer = 0;
+
+		/*
+		 * If tail crashed into head, and head is not empty, then it is time
+		 * to start that range.
+		 */
+		if (pgsr->tail == pgsr->head &&
+			pgsr->ranges[pgsr->head].nblocks > 0)
+			pg_streaming_read_start_head_range(pgsr);
+	}
+
+	Assert(pgsr->pinned_buffers == 0);
+
+	return InvalidBuffer;
+}
+
+void
+pg_streaming_read_free(PgStreamingRead *pgsr)
+{
+	Buffer		buffer;
+
+	/* Stop looking ahead. */
+	pgsr->finished = true;
+
+	/* Unpin anything that wasn't consumed. */
+	while ((buffer = pg_streaming_read_buffer_get_next(pgsr, NULL)) != InvalidBuffer)
+		ReleaseBuffer(buffer);
+
+	Assert(pgsr->pinned_buffers == 0);
+	Assert(pgsr->ios_in_progress == 0);
+
+	/* Release memory. */
+	if (pgsr->per_buffer_data)
+		pfree(pgsr->per_buffer_data);
+
+	pfree(pgsr);
+}
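
For reviewers, the ring traversal in pg_streaming_read_buffer_get_next() above can be modelled in miniature. The following standalone sketch uses invented names (ToyStream, ToyRange, toy_get_next are not from the patch, and plain integers stand in for pinned buffers): buffers are handed out from the tail range, and the tail index wraps when it reaches the end of the ranges array, mirroring the ++pgsr->tail == pgsr->size check.

```c
#include <assert.h>

/*
 * Toy model of the "ranges" ring consumed by
 * pg_streaming_read_buffer_get_next().  All names are invented for
 * illustration; integers stand in for pinned buffers.
 */
#define RING_SIZE 4
#define MAX_PER_RANGE 4

typedef struct ToyRange
{
	int			nblocks;	/* buffers available in this range */
	int			buffers[MAX_PER_RANGE];
} ToyRange;

typedef struct ToyStream
{
	ToyRange	ranges[RING_SIZE];
	int			head;		/* next range to fill */
	int			tail;		/* next range to drain */
	int			next_tail_buffer;
} ToyStream;

/*
 * Hand out the next buffer from the tail range; advance and wrap the tail
 * when a range is exhausted; return -1 when the ring is drained.
 */
static int
toy_get_next(ToyStream *s)
{
	while (s->tail != s->head)
	{
		ToyRange   *r = &s->ranges[s->tail];

		if (s->next_tail_buffer < r->nblocks)
			return r->buffers[s->next_tail_buffer++];

		/* Advance tail to the next range, wrapping around the array. */
		if (++s->tail == RING_SIZE)
			s->tail = 0;
		s->next_tail_buffer = 0;
	}
	return -1;
}
```

Draining two queued ranges of two and one buffers yields them in block order, then the sentinel.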
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f0f8d4259c5..729d1f91721 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -19,6 +19,11 @@
  *		and pin it so that no one can destroy it while this process
  *		is using it.
  *
+ * StartReadBuffers() -- as above, but for multiple contiguous blocks in
+ *		two steps.
+ *
+ * WaitReadBuffers() -- second step of StartReadBuffers().
+ *
  * ReleaseBuffer() -- unpin a buffer
  *
  * MarkBufferDirty() -- mark a pinned buffer's contents as "dirty".
@@ -471,10 +476,9 @@ ForgetPrivateRefCountEntry(PrivateRefCountEntry *ref)
 )
 
 
-static Buffer ReadBuffer_common(SMgrRelation smgr, char relpersistence,
+static Buffer ReadBuffer_common(BufferManagerRelation bmr,
 								ForkNumber forkNum, BlockNumber blockNum,
-								ReadBufferMode mode, BufferAccessStrategy strategy,
-								bool *hit);
+								ReadBufferMode mode, BufferAccessStrategy strategy);
 static BlockNumber ExtendBufferedRelCommon(BufferManagerRelation bmr,
 										   ForkNumber fork,
 										   BufferAccessStrategy strategy,
@@ -500,7 +504,7 @@ static uint32 WaitBufHdrUnlocked(BufferDesc *buf);
 static int	SyncOneBuffer(int buf_id, bool skip_recently_used,
 						  WritebackContext *wb_context);
 static void WaitIO(BufferDesc *buf);
-static bool StartBufferIO(BufferDesc *buf, bool forInput);
+static bool StartBufferIO(BufferDesc *buf, bool forInput, bool nowait);
 static void TerminateBufferIO(BufferDesc *buf, bool clear_dirty,
 							  uint32 set_flag_bits, bool forget_owner);
 static void AbortBufferIO(Buffer buffer);
@@ -781,7 +785,6 @@ Buffer
 ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				   ReadBufferMode mode, BufferAccessStrategy strategy)
 {
-	bool		hit;
 	Buffer		buf;
 
 	/*
@@ -794,15 +797,9 @@ ReadBufferExtended(Relation reln, ForkNumber forkNum, BlockNumber blockNum,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary tables of other sessions")));
 
-	/*
-	 * Read the buffer, and update pgstat counters to reflect a cache hit or
-	 * miss.
-	 */
-	pgstat_count_buffer_read(reln);
-	buf = ReadBuffer_common(RelationGetSmgr(reln), reln->rd_rel->relpersistence,
-							forkNum, blockNum, mode, strategy, &hit);
-	if (hit)
-		pgstat_count_buffer_hit(reln);
+	buf = ReadBuffer_common(BMR_REL(reln),
+							forkNum, blockNum, mode, strategy);
+
 	return buf;
 }
 
@@ -822,13 +819,12 @@ ReadBufferWithoutRelcache(RelFileLocator rlocator, ForkNumber forkNum,
 						  BlockNumber blockNum, ReadBufferMode mode,
 						  BufferAccessStrategy strategy, bool permanent)
 {
-	bool		hit;
-
 	SMgrRelation smgr = smgropen(rlocator, INVALID_PROC_NUMBER);
 
-	return ReadBuffer_common(smgr, permanent ? RELPERSISTENCE_PERMANENT :
-							 RELPERSISTENCE_UNLOGGED, forkNum, blockNum,
-							 mode, strategy, &hit);
+	return ReadBuffer_common(BMR_SMGR(smgr, permanent ? RELPERSISTENCE_PERMANENT :
+									  RELPERSISTENCE_UNLOGGED),
+							 forkNum, blockNum,
+							 mode, strategy);
 }
 
 /*
@@ -994,35 +990,68 @@ ExtendBufferedRelTo(BufferManagerRelation bmr,
 	 */
 	if (buffer == InvalidBuffer)
 	{
-		bool		hit;
-
 		Assert(extended_by == 0);
-		buffer = ReadBuffer_common(bmr.smgr, bmr.relpersistence,
-								   fork, extend_to - 1, mode, strategy,
-								   &hit);
+		buffer = ReadBuffer_common(bmr, fork, extend_to - 1, mode, strategy);
 	}
 
 	return buffer;
 }
 
+/*
+ * Zero a buffer and lock it, as part of the implementation of
+ * RBM_ZERO_AND_LOCK or RBM_ZERO_AND_CLEANUP_LOCK.  The buffer must already
+ * be pinned.  It does not have to be valid, but it is valid and locked on
+ * return.
+ */
+static void
+ZeroBuffer(Buffer buffer, ReadBufferMode mode)
+{
+	BufferDesc *bufHdr;
+	uint32		buf_state;
+
+	Assert(mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK);
+
+	if (BufferIsLocal(buffer))
+		bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+	else
+	{
+		bufHdr = GetBufferDescriptor(buffer - 1);
+		if (mode == RBM_ZERO_AND_LOCK)
+			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+		else
+			LockBufferForCleanup(buffer);
+	}
+
+	memset(BufferGetPage(buffer), 0, BLCKSZ);
+
+	if (BufferIsLocal(buffer))
+	{
+		buf_state = pg_atomic_read_u32(&bufHdr->state);
+		buf_state |= BM_VALID;
+		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+	}
+	else
+	{
+		buf_state = LockBufHdr(bufHdr);
+		buf_state |= BM_VALID;
+		UnlockBufHdr(bufHdr, buf_state);
+	}
+}
+
 /*
  * ReadBuffer_common -- common logic for all ReadBuffer variants
- *
- * *hit is set to true if the request was satisfied from shared buffer cache.
  */
 static Buffer
-ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
+ReadBuffer_common(BufferManagerRelation bmr, ForkNumber forkNum,
 				  BlockNumber blockNum, ReadBufferMode mode,
-				  BufferAccessStrategy strategy, bool *hit)
+				  BufferAccessStrategy strategy)
 {
-	BufferDesc *bufHdr;
-	Block		bufBlock;
-	bool		found;
-	IOContext	io_context;
-	IOObject	io_object;
-	bool		isLocalBuf = SmgrIsTemp(smgr);
-
-	*hit = false;
+	ReadBuffersOperation operation;
+	Buffer		buffer;
+	int			nblocks;
+	int			flags;
 
 	/*
 	 * Backward compatibility path, most code should use ExtendBufferedRel()
@@ -1041,181 +1070,404 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
 			flags |= EB_LOCK_FIRST;
 
-		return ExtendBufferedRel(BMR_SMGR(smgr, relpersistence),
-								 forkNum, strategy, flags);
+		return ExtendBufferedRel(bmr, forkNum, strategy, flags);
 	}
 
-	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
-									   smgr->smgr_rlocator.locator.spcOid,
-									   smgr->smgr_rlocator.locator.dbOid,
-									   smgr->smgr_rlocator.locator.relNumber,
-									   smgr->smgr_rlocator.backend);
+	nblocks = 1;
+	if (mode == RBM_ZERO_ON_ERROR)
+		flags = READ_BUFFERS_ZERO_ON_ERROR;
+	else
+		flags = 0;
+	if (StartReadBuffers(bmr,
+						 &buffer,
+						 forkNum,
+						 blockNum,
+						 &nblocks,
+						 strategy,
+						 flags,
+						 &operation))
+		WaitReadBuffers(&operation);
+	Assert(nblocks == 1);		/* single block can't be short */
+
+	if (mode == RBM_ZERO_AND_CLEANUP_LOCK || mode == RBM_ZERO_AND_LOCK)
+		ZeroBuffer(buffer, mode);
+
+	return buffer;
+}
+
+static Buffer
+PrepareReadBuffer(BufferManagerRelation bmr,
+				  ForkNumber forkNum,
+				  BlockNumber blockNum,
+				  BufferAccessStrategy strategy,
+				  bool *foundPtr)
+{
+	BufferDesc *bufHdr;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	Assert(blockNum != P_NEW);
 
+	Assert(bmr.smgr);
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/*
-		 * We do not use a BufferAccessStrategy for I/O of temporary tables.
-		 * However, in some cases, the "strategy" may not be NULL, so we can't
-		 * rely on IOContextForStrategy() to set the right IOContext for us.
-		 * This may happen in cases like CREATE TEMPORARY TABLE AS...
-		 */
 		io_context = IOCONTEXT_NORMAL;
 		io_object = IOOBJECT_TEMP_RELATION;
-		bufHdr = LocalBufferAlloc(smgr, forkNum, blockNum, &found);
-		if (found)
-			pgBufferUsage.local_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.local_blks_read++;
 	}
 	else
 	{
-		/*
-		 * lookup the buffer.  IO_IN_PROGRESS is set if the requested block is
-		 * not currently in memory.
-		 */
 		io_context = IOContextForStrategy(strategy);
 		io_object = IOOBJECT_RELATION;
-		bufHdr = BufferAlloc(smgr, relpersistence, forkNum, blockNum,
-							 strategy, &found, io_context);
-		if (found)
-			pgBufferUsage.shared_blks_hit++;
-		else if (mode == RBM_NORMAL || mode == RBM_NORMAL_NO_LOG ||
-				 mode == RBM_ZERO_ON_ERROR)
-			pgBufferUsage.shared_blks_read++;
 	}
 
-	/* At this point we do NOT hold any locks. */
+	TRACE_POSTGRESQL_BUFFER_READ_START(forkNum, blockNum,
+									   bmr.smgr->smgr_rlocator.locator.spcOid,
+									   bmr.smgr->smgr_rlocator.locator.dbOid,
+									   bmr.smgr->smgr_rlocator.locator.relNumber,
+									   bmr.smgr->smgr_rlocator.backend);
 
-	/* if it was already in the buffer pool, we're done */
-	if (found)
+	ResourceOwnerEnlarge(CurrentResourceOwner);
+	if (isLocalBuf)
+	{
+		bufHdr = LocalBufferAlloc(bmr.smgr, forkNum, blockNum, foundPtr);
+		if (*foundPtr)
+			pgBufferUsage.local_blks_hit++;
+	}
+	else
+	{
+		bufHdr = BufferAlloc(bmr.smgr, bmr.relpersistence, forkNum, blockNum,
+							 strategy, foundPtr, io_context);
+		if (*foundPtr)
+			pgBufferUsage.shared_blks_hit++;
+	}
+	if (bmr.rel)
+	{
+		/*
+		 * While pgBufferUsage's "read" counter isn't bumped unless we reach
+		 * WaitReadBuffers() (so, not for hits, and not for buffers that are
+		 * zeroed instead), the per-relation stats always count them.
+		 */
+		pgstat_count_buffer_read(bmr.rel);
+		if (*foundPtr)
+			pgstat_count_buffer_hit(bmr.rel);
+	}
+	if (*foundPtr)
 	{
-		/* Just need to update stats before we exit */
-		*hit = true;
 		VacuumPageHit++;
 		pgstat_count_io_op(io_object, io_context, IOOP_HIT);
-
 		if (VacuumCostActive)
 			VacuumCostBalance += VacuumCostPageHit;
 
 		TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-										  smgr->smgr_rlocator.locator.spcOid,
-										  smgr->smgr_rlocator.locator.dbOid,
-										  smgr->smgr_rlocator.locator.relNumber,
-										  smgr->smgr_rlocator.backend,
-										  found);
+										  bmr.smgr->smgr_rlocator.locator.spcOid,
+										  bmr.smgr->smgr_rlocator.locator.dbOid,
+										  bmr.smgr->smgr_rlocator.locator.relNumber,
+										  bmr.smgr->smgr_rlocator.backend,
+										  true);
+	}
 
-		/*
-		 * In RBM_ZERO_AND_LOCK mode the caller expects the page to be locked
-		 * on return.
-		 */
-		if (!isLocalBuf)
-		{
-			if (mode == RBM_ZERO_AND_LOCK)
-				LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
-							  LW_EXCLUSIVE);
-			else if (mode == RBM_ZERO_AND_CLEANUP_LOCK)
-				LockBufferForCleanup(BufferDescriptorGetBuffer(bufHdr));
-		}
+	return BufferDescriptorGetBuffer(bufHdr);
+}
 
-		return BufferDescriptorGetBuffer(bufHdr);
+/*
+ * Begin reading a range of blocks beginning at blockNum and extending for
+ * *nblocks.  On return, up to *nblocks pinned buffers holding those blocks
+ * are written into the buffers array, and *nblocks is updated to contain the
+ * actual number, which may be fewer than requested.
+ *
+ * If false is returned, no I/O is necessary and WaitReadBuffers() need not
+ * be called.  If true is returned, one I/O has been started, and
+ * WaitReadBuffers() must be called with the same operation object before the
+ * buffers are accessed.  Along with the operation object, the caller-supplied
+ * array of buffers must remain valid until WaitReadBuffers() is called.
+ *
+ * Currently the I/O is only started with optional operating system advice,
+ * and the real I/O happens in WaitReadBuffers().  In future work, true I/O
+ * could be initiated here.
+ */
+bool
+StartReadBuffers(BufferManagerRelation bmr,
+				 Buffer *buffers,
+				 ForkNumber forkNum,
+				 BlockNumber blockNum,
+				 int *nblocks,
+				 BufferAccessStrategy strategy,
+				 int flags,
+				 ReadBuffersOperation *operation)
+{
+	int			actual_nblocks = *nblocks;
+
+	if (bmr.rel)
+	{
+		bmr.smgr = RelationGetSmgr(bmr.rel);
+		bmr.relpersistence = bmr.rel->rd_rel->relpersistence;
 	}
 
-	/*
-	 * if we have gotten to this point, we have allocated a buffer for the
-	 * page but its contents are not yet valid.  IO_IN_PROGRESS is set for it,
-	 * if it's a shared buffer.
-	 */
-	Assert(!(pg_atomic_read_u32(&bufHdr->state) & BM_VALID));	/* spinlock not needed */
+	operation->bmr = bmr;
+	operation->forknum = forkNum;
+	operation->blocknum = blockNum;
+	operation->buffers = buffers;
+	operation->nblocks = actual_nblocks;
+	operation->strategy = strategy;
+	operation->flags = flags;
 
-	bufBlock = isLocalBuf ? LocalBufHdrGetBlock(bufHdr) : BufHdrGetBlock(bufHdr);
+	operation->io_buffers_len = 0;
 
-	/*
-	 * Read in the page, unless the caller intends to overwrite it and just
-	 * wants us to allocate a buffer.
-	 */
-	if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK)
-		MemSet((char *) bufBlock, 0, BLCKSZ);
-	else
+	for (int i = 0; i < actual_nblocks; ++i)
 	{
-		instr_time	io_start = pgstat_prepare_io_time(track_io_timing);
+		bool		found;
 
-		smgrread(smgr, forkNum, blockNum, bufBlock);
+		buffers[i] = PrepareReadBuffer(bmr,
+									   forkNum,
+									   blockNum + i,
+									   strategy,
+									   &found);
 
-		pgstat_count_io_op_time(io_object, io_context,
-								IOOP_READ, io_start, 1);
+		if (found)
+		{
+			/*
+			 * Terminate the read as soon as we get a hit.  It could be a
+			 * single buffer hit, or it could be a hit that follows a readable
+			 * range.  We don't want to create more than one readable range,
+			 * so we stop here.
+			 */
+			actual_nblocks = operation->nblocks = *nblocks = i + 1;
+		}
+		else
+		{
+			/* Extend the readable range to cover this block. */
+			operation->io_buffers_len++;
+		}
+	}
 
-		/* check for garbage data */
-		if (!PageIsVerifiedExtended((Page) bufBlock, blockNum,
-									PIV_LOG_WARNING | PIV_REPORT_STAT))
+	if (operation->io_buffers_len > 0)
+	{
+		if (flags & READ_BUFFERS_ISSUE_ADVICE)
 		{
-			if (mode == RBM_ZERO_ON_ERROR || zero_damaged_pages)
-			{
-				ereport(WARNING,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s; zeroing out page",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
-				MemSet((char *) bufBlock, 0, BLCKSZ);
-			}
-			else
-				ereport(ERROR,
-						(errcode(ERRCODE_DATA_CORRUPTED),
-						 errmsg("invalid page in block %u of relation %s",
-								blockNum,
-								relpath(smgr->smgr_rlocator, forkNum))));
+			/*
+			 * In theory we should only do this if PrepareReadBuffer() had to
+			 * allocate new buffers above.  That way, if two calls to
+			 * StartReadBuffers() were made for the same blocks before
+			 * WaitReadBuffers(), only the first would issue the advice.
+			 * That'd be a better simulation of true asynchronous I/O, which
+			 * would only start the I/O once, but isn't done here for
+			 * simplicity.  Note also that the following call might actually
+			 * issue two advice calls if we cross a segment boundary; in a
+			 * true asynchronous version we might choose to process only one
+			 * real I/O at a time in that case.
+			 */
+			smgrprefetch(bmr.smgr, forkNum, blockNum, operation->io_buffers_len);
 		}
+
+		/* Indicate that WaitReadBuffers() should be called. */
+		return true;
 	}
+	else
+	{
+		return false;
+	}
+}
 
-	/*
-	 * In RBM_ZERO_AND_LOCK / RBM_ZERO_AND_CLEANUP_LOCK mode, grab the buffer
-	 * content lock before marking the page as valid, to make sure that no
-	 * other backend sees the zeroed page before the caller has had a chance
-	 * to initialize it.
-	 *
-	 * Since no-one else can be looking at the page contents yet, there is no
-	 * difference between an exclusive lock and a cleanup-strength lock. (Note
-	 * that we cannot use LockBuffer() or LockBufferForCleanup() here, because
-	 * they assert that the buffer is already valid.)
-	 */
-	if ((mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) &&
-		!isLocalBuf)
+static inline bool
+WaitReadBuffersCanStartIO(Buffer buffer, bool nowait)
+{
+	if (BufferIsLocal(buffer))
 	{
-		LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_EXCLUSIVE);
+		BufferDesc *bufHdr = GetLocalBufferDescriptor(-buffer - 1);
+
+		return (pg_atomic_read_u32(&bufHdr->state) & BM_VALID) == 0;
 	}
+	else
+		return StartBufferIO(GetBufferDescriptor(buffer - 1), true, nowait);
+}
+
+void
+WaitReadBuffers(ReadBuffersOperation *operation)
+{
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	int			nblocks;
+	BlockNumber blocknum;
+	ForkNumber	forknum;
+	bool		isLocalBuf;
+	IOContext	io_context;
+	IOObject	io_object;
+
+	/*
+	 * Currently operations are only allowed to include a read of some range,
+	 * with an optional extra buffer that is already pinned at the end.  So
+	 * nblocks can be at most one more than io_buffers_len.
+	 */
+	Assert((operation->nblocks == operation->io_buffers_len) ||
+		   (operation->nblocks == operation->io_buffers_len + 1));
 
+	/* Find the range of the physical read we need to perform. */
+	nblocks = operation->io_buffers_len;
+	if (nblocks == 0)
+		return;					/* nothing to do */
+
+	buffers = &operation->buffers[0];
+	blocknum = operation->blocknum;
+	forknum = operation->forknum;
+	bmr = operation->bmr;
+
+	isLocalBuf = SmgrIsTemp(bmr.smgr);
 	if (isLocalBuf)
 	{
-		/* Only need to adjust flags */
-		uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
-
-		buf_state |= BM_VALID;
-		pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+		io_context = IOCONTEXT_NORMAL;
+		io_object = IOOBJECT_TEMP_RELATION;
 	}
 	else
 	{
-		/* Set BM_VALID, terminate IO, and wake up any waiters */
-		TerminateBufferIO(bufHdr, false, BM_VALID, true);
+		io_context = IOContextForStrategy(operation->strategy);
+		io_object = IOOBJECT_RELATION;
 	}
 
-	VacuumPageMiss++;
-	if (VacuumCostActive)
-		VacuumCostBalance += VacuumCostPageMiss;
+	/*
+	 * We count all these blocks as read by this backend.  This is traditional
+	 * behavior, but might turn out to be not true if we find that someone
+	 * else has beaten us and completed the read of some of these blocks.  In
+	 * that case the system globally double-counts, but we traditionally don't
+	 * count this as a "hit", and we don't have a separate counter for "miss,
+	 * but another backend completed the read".
+	 */
+	if (isLocalBuf)
+		pgBufferUsage.local_blks_read += nblocks;
+	else
+		pgBufferUsage.shared_blks_read += nblocks;
 
-	TRACE_POSTGRESQL_BUFFER_READ_DONE(forkNum, blockNum,
-									  smgr->smgr_rlocator.locator.spcOid,
-									  smgr->smgr_rlocator.locator.dbOid,
-									  smgr->smgr_rlocator.locator.relNumber,
-									  smgr->smgr_rlocator.backend,
-									  found);
+	for (int i = 0; i < nblocks; ++i)
+	{
+		int			io_buffers_len;
+		Buffer		io_buffers[MAX_BUFFERS_PER_TRANSFER];
+		void	   *io_pages[MAX_BUFFERS_PER_TRANSFER];
+		instr_time	io_start;
+		BlockNumber io_first_block;
 
-	return BufferDescriptorGetBuffer(bufHdr);
+		/*
+		 * Skip this block if someone else has already completed it.  If an
+		 * I/O is already in progress in another backend, this will wait for
+		 * the outcome: either done, or something went wrong and we will
+		 * retry.
+		 */
+		if (!WaitReadBuffersCanStartIO(buffers[i], false))
+		{
+			/*
+			 * Report this as a 'hit' for this backend, even though it must
+			 * have started out as a miss in PrepareReadBuffer().
+			 */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, blocknum + i,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  true);
+			continue;
+		}
+
+		/* We found a buffer that we need to read in. */
+		io_buffers[0] = buffers[i];
+		io_pages[0] = BufferGetBlock(buffers[i]);
+		io_first_block = blocknum + i;
+		io_buffers_len = 1;
+
+		/*
+		 * How many neighboring-on-disk blocks can we scatter-read into
+		 * other buffers at the same time?  In this case we don't wait if we
+		 * see an I/O already in progress.  We already hold BM_IO_IN_PROGRESS
+		 * for the head block, so we should get on with that I/O as soon as
+		 * possible.  We'll come back to this block again, above.
+		 */
+		while ((i + 1) < nblocks &&
+			   WaitReadBuffersCanStartIO(buffers[i + 1], true))
+		{
+			/* Must be consecutive block numbers. */
+			Assert(BufferGetBlockNumber(buffers[i + 1]) ==
+				   BufferGetBlockNumber(buffers[i]) + 1);
+
+			io_buffers[io_buffers_len] = buffers[++i];
+			io_pages[io_buffers_len++] = BufferGetBlock(buffers[i]);
+		}
+
+		io_start = pgstat_prepare_io_time(track_io_timing);
+		smgrreadv(bmr.smgr, forknum, io_first_block, io_pages, io_buffers_len);
+		pgstat_count_io_op_time(io_object, io_context, IOOP_READ, io_start,
+								io_buffers_len);
+
+		/* Verify each block we read, and terminate the I/O. */
+		for (int j = 0; j < io_buffers_len; ++j)
+		{
+			BufferDesc *bufHdr;
+			Block		bufBlock;
+
+			if (isLocalBuf)
+			{
+				bufHdr = GetLocalBufferDescriptor(-io_buffers[j] - 1);
+				bufBlock = LocalBufHdrGetBlock(bufHdr);
+			}
+			else
+			{
+				bufHdr = GetBufferDescriptor(io_buffers[j] - 1);
+				bufBlock = BufHdrGetBlock(bufHdr);
+			}
+
+			/* check for garbage data */
+			if (!PageIsVerifiedExtended((Page) bufBlock, io_first_block + j,
+										PIV_LOG_WARNING | PIV_REPORT_STAT))
+			{
+				if ((operation->flags & READ_BUFFERS_ZERO_ON_ERROR) || zero_damaged_pages)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s; zeroing out page",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+					memset(bufBlock, 0, BLCKSZ);
+				}
+				else
+					ereport(ERROR,
+							(errcode(ERRCODE_DATA_CORRUPTED),
+							 errmsg("invalid page in block %u of relation %s",
+									io_first_block + j,
+									relpath(bmr.smgr->smgr_rlocator, forknum))));
+			}
+
+			/* Terminate I/O and set BM_VALID. */
+			if (isLocalBuf)
+			{
+				uint32		buf_state = pg_atomic_read_u32(&bufHdr->state);
+
+				buf_state |= BM_VALID;
+				pg_atomic_unlocked_write_u32(&bufHdr->state, buf_state);
+			}
+			else
+			{
+				/* Set BM_VALID, terminate IO, and wake up any waiters */
+				TerminateBufferIO(bufHdr, false, BM_VALID, true);
+			}
+
+			/* Report I/Os as completing individually. */
+			TRACE_POSTGRESQL_BUFFER_READ_DONE(forknum, io_first_block + j,
+											  bmr.smgr->smgr_rlocator.locator.spcOid,
+											  bmr.smgr->smgr_rlocator.locator.dbOid,
+											  bmr.smgr->smgr_rlocator.locator.relNumber,
+											  bmr.smgr->smgr_rlocator.backend,
+											  false);
+		}
+
+		VacuumPageMiss += io_buffers_len;
+		if (VacuumCostActive)
+			VacuumCostBalance += VacuumCostPageMiss * io_buffers_len;
+	}
 }
 
 /*
- * BufferAlloc -- subroutine for ReadBuffer.  Handles lookup of a shared
- *		buffer.  If no buffer exists already, selects a replacement
- *		victim and evicts the old page, but does NOT read in new page.
+ * BufferAlloc -- subroutine for StartReadBuffers.  Handles lookup of a shared
+ *		buffer.  If no buffer exists already, selects a replacement victim and
+ *		evicts the old page, but does NOT read in new page.
  *
  * "strategy" can be a buffer replacement strategy object, or NULL for
  * the default strategy.  The selected buffer's usage_count is advanced when
@@ -1223,11 +1475,7 @@ ReadBuffer_common(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
  *
  * The returned buffer is pinned and is already marked as holding the
  * desired page.  If it already did have the desired page, *foundPtr is
- * set true.  Otherwise, *foundPtr is set false and the buffer is marked
- * as IO_IN_PROGRESS; ReadBuffer will now need to do I/O to fill it.
- *
- * *foundPtr is actually redundant with the buffer's BM_VALID flag, but
- * we keep it for simplicity in ReadBuffer.
+ * set true.  Otherwise, *foundPtr is set false.
  *
  * io_context is passed as an output parameter to avoid calling
  * IOContextForStrategy() when there is a shared buffers hit and no IO
@@ -1286,19 +1534,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(buf, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return buf;
@@ -1363,19 +1602,10 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		{
 			/*
 			 * We can only get here if (a) someone else is still reading in
-			 * the page, or (b) a previous read attempt failed.  We have to
-			 * wait for any active read attempt to finish, and then set up our
-			 * own read attempt if the page is still not BM_VALID.
-			 * StartBufferIO does it all.
+			 * the page, (b) a previous read attempt failed, or (c) someone
+			 * called StartReadBuffers() but not yet WaitReadBuffers().
 			 */
-			if (StartBufferIO(existing_buf_hdr, true))
-			{
-				/*
-				 * If we get here, previous attempts to read the buffer must
-				 * have failed ... but we shall bravely try again.
-				 */
-				*foundPtr = false;
-			}
+			*foundPtr = false;
 		}
 
 		return existing_buf_hdr;
@@ -1407,15 +1637,9 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 	LWLockRelease(newPartitionLock);
 
 	/*
-	 * Buffer contents are currently invalid.  Try to obtain the right to
-	 * start I/O.  If StartBufferIO returns false, then someone else managed
-	 * to read it before we did, so there's nothing left for BufferAlloc() to
-	 * do.
+	 * Buffer contents are currently invalid.
 	 */
-	if (StartBufferIO(victim_buf_hdr, true))
-		*foundPtr = false;
-	else
-		*foundPtr = true;
+	*foundPtr = false;
 
 	return victim_buf_hdr;
 }
@@ -1769,7 +1993,7 @@ again:
  * pessimistic, but outside of toy-sized shared_buffers it should allow
  * sufficient pins.
  */
-static void
+void
 LimitAdditionalPins(uint32 *additional_pins)
 {
 	uint32		max_backends;
@@ -2034,7 +2258,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 
 				buf_state &= ~BM_VALID;
 				UnlockBufHdr(existing_hdr, buf_state);
-			} while (!StartBufferIO(existing_hdr, true));
+			} while (!StartBufferIO(existing_hdr, true, false));
 		}
 		else
 		{
@@ -2057,7 +2281,7 @@ ExtendBufferedRelShared(BufferManagerRelation bmr,
 			LWLockRelease(partition_lock);
 
 			/* XXX: could combine the locked operations in it with the above */
-			StartBufferIO(victim_buf_hdr, true);
+			StartBufferIO(victim_buf_hdr, true, false);
 		}
 	}
 
@@ -2372,7 +2596,12 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 	else
 	{
 		/*
-		 * If we previously pinned the buffer, it must surely be valid.
+		 * If we previously pinned the buffer, it is likely to be valid, but
+		 * it may not be if StartReadBuffers() was called and
+		 * WaitReadBuffers() hasn't been called yet.  We'll check by loading
+		 * the flags without locking.  This is racy, but it's OK to return
+		 * false spuriously: when WaitReadBuffers() calls StartBufferIO(),
+		 * it'll see that it's now valid.
 		 *
 		 * Note: We deliberately avoid a Valgrind client request here.
 		 * Individual access methods can optionally superimpose buffer page
@@ -2381,7 +2610,7 @@ PinBuffer(BufferDesc *buf, BufferAccessStrategy strategy)
 		 * that the buffer page is legitimately non-accessible here.  We
 		 * cannot meddle with that.
 		 */
-		result = true;
+		result = (pg_atomic_read_u32(&buf->state) & BM_VALID) != 0;
 	}
 
 	ref->refcount++;
@@ -3449,7 +3678,7 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln, IOObject io_object,
 	 * someone else flushed the buffer before we could, so we need not do
 	 * anything.
 	 */
-	if (!StartBufferIO(buf, false))
+	if (!StartBufferIO(buf, false, false))
 		return;
 
 	/* Setup error traceback support for ereport() */
@@ -5184,9 +5413,15 @@ WaitIO(BufferDesc *buf)
  *
  * Returns true if we successfully marked the buffer as I/O busy,
  * false if someone else already did the work.
+ *
+ * If nowait is true, then we don't wait for an I/O to be finished by another
+ * backend.  In that case, false indicates either that the I/O was already
+ * finished, or that it is still in progress.  This is useful for callers that
+ * want to find out if they can perform the I/O as part of a larger operation,
+ * without waiting for the answer or distinguishing the reasons why not.
  */
 static bool
-StartBufferIO(BufferDesc *buf, bool forInput)
+StartBufferIO(BufferDesc *buf, bool forInput, bool nowait)
 {
 	uint32		buf_state;
 
@@ -5199,6 +5434,8 @@ StartBufferIO(BufferDesc *buf, bool forInput)
 		if (!(buf_state & BM_IO_IN_PROGRESS))
 			break;
 		UnlockBufHdr(buf, buf_state);
+		if (nowait)
+			return false;
 		WaitIO(buf);
 	}
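
A reviewer's sketch (not part of the patch; all names invented) of the contract StartReadBuffers() implements above: scanning the requested blocks in order, a miss extends the single pending I/O (io_buffers_len), while a hit terminates the range so that at most one contiguous read remains for the wait step.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the range-clamping loop in StartReadBuffers().  "cached"
 * stands in for a buffer-mapping lookup; names are invented for
 * illustration only.
 */
static bool
toy_start_read(const bool *cached, int blocknum, int *nblocks, int *io_len)
{
	*io_len = 0;
	for (int i = 0; i < *nblocks; i++)
	{
		if (cached[blocknum + i])
		{
			/* A hit terminates the range: include it, then stop. */
			*nblocks = i + 1;
			break;
		}
		/* A miss extends the single contiguous I/O. */
		(*io_len)++;
	}
	/* True means the caller must still perform the wait step. */
	return *io_len > 0;
}
```

With a hit at block 2, a six-block request is clamped to three blocks, two of which need reading; a request that starts on a hit needs no I/O at all.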
 
diff --git a/src/backend/storage/buffer/localbuf.c b/src/backend/storage/buffer/localbuf.c
index fcfac335a57..985a2c7049c 100644
--- a/src/backend/storage/buffer/localbuf.c
+++ b/src/backend/storage/buffer/localbuf.c
@@ -108,10 +108,9 @@ PrefetchLocalBuffer(SMgrRelation smgr, ForkNumber forkNum,
  * LocalBufferAlloc -
  *	  Find or create a local buffer for the given page of the given relation.
  *
- * API is similar to bufmgr.c's BufferAlloc, except that we do not need
- * to do any locking since this is all local.   Also, IO_IN_PROGRESS
- * does not get set.  Lastly, we support only default access strategy
- * (hence, usage_count is always advanced).
+ * API is similar to bufmgr.c's BufferAlloc, except that we do not need to do
+ * any locking since this is all local.  We support only default access
+ * strategy (hence, usage_count is always advanced).
  */
 BufferDesc *
 LocalBufferAlloc(SMgrRelation smgr, ForkNumber forkNum, BlockNumber blockNum,
@@ -287,7 +286,7 @@ GetLocalVictimBuffer(void)
 }
 
 /* see LimitAdditionalPins() */
-static void
+void
 LimitAdditionalLocalPins(uint32 *additional_pins)
 {
 	uint32		max_pins;
@@ -297,9 +296,10 @@ LimitAdditionalLocalPins(uint32 *additional_pins)
 
 	/*
 	 * In contrast to LimitAdditionalPins() other backends don't play a role
-	 * here. We can allow up to NLocBuffer pins in total.
+	 * here. We can allow up to NLocBuffer pins in total, but NLocBuffer
+	 * might not be initialized yet, so read num_temp_buffers instead.
 	 */
-	max_pins = (NLocBuffer - NLocalPinnedBuffers);
+	max_pins = (num_temp_buffers - NLocalPinnedBuffers);
 
 	if (*additional_pins >= max_pins)
 		*additional_pins = max_pins;
diff --git a/src/backend/storage/meson.build b/src/backend/storage/meson.build
index 40345bdca27..739d13293fb 100644
--- a/src/backend/storage/meson.build
+++ b/src/backend/storage/meson.build
@@ -1,5 +1,6 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
+subdir('aio')
 subdir('buffer')
 subdir('file')
 subdir('freespace')
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d3353..b57f71f97e3 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -14,6 +14,7 @@
 #ifndef BUFMGR_H
 #define BUFMGR_H
 
+#include "port/pg_iovec.h"
 #include "storage/block.h"
 #include "storage/buf.h"
 #include "storage/bufpage.h"
@@ -158,6 +159,11 @@ extern PGDLLIMPORT int32 *LocalRefCount;
 #define BUFFER_LOCK_SHARE		1
 #define BUFFER_LOCK_EXCLUSIVE	2
 
+/*
+ * Maximum number of buffers for multi-buffer I/O functions.  This is set to
+ * allow 128kB transfers, unless BLCKSZ and IOV_MAX imply a smaller maximum.
+ */
+#define MAX_BUFFERS_PER_TRANSFER Min(PG_IOV_MAX, (128 * 1024) / BLCKSZ)
 
 /*
  * prototypes for functions in bufmgr.c
@@ -177,6 +183,42 @@ extern Buffer ReadBufferWithoutRelcache(RelFileLocator rlocator,
 										ForkNumber forkNum, BlockNumber blockNum,
 										ReadBufferMode mode, BufferAccessStrategy strategy,
 										bool permanent);
+
+#define READ_BUFFERS_ZERO_ON_ERROR 0x01
+#define READ_BUFFERS_ISSUE_ADVICE 0x02
+
+/*
+ * Private state used by StartReadBuffers() and WaitReadBuffers().  Declared
+ * in public header only to allow inclusion in other structs, but contents
+ * should not be accessed.
+ */
+struct ReadBuffersOperation
+{
+	/* Parameters passed in to StartReadBuffers(). */
+	BufferManagerRelation bmr;
+	Buffer	   *buffers;
+	ForkNumber	forknum;
+	BlockNumber blocknum;
+	int			nblocks;
+	BufferAccessStrategy strategy;
+	int			flags;
+
+	/* Range of buffers, if we need to perform a read. */
+	int			io_buffers_len;
+};
+
+typedef struct ReadBuffersOperation ReadBuffersOperation;
+
+extern bool StartReadBuffers(BufferManagerRelation bmr,
+							 Buffer *buffers,
+							 ForkNumber forknum,
+							 BlockNumber blocknum,
+							 int *nblocks,
+							 BufferAccessStrategy strategy,
+							 int flags,
+							 ReadBuffersOperation *operation);
+extern void WaitReadBuffers(ReadBuffersOperation *operation);
+
 extern void ReleaseBuffer(Buffer buffer);
 extern void UnlockReleaseBuffer(Buffer buffer);
 extern bool BufferIsExclusiveLocked(Buffer buffer);
@@ -250,6 +292,9 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
 
 extern bool BgBufferSync(struct WritebackContext *wb_context);
 
+extern void LimitAdditionalPins(uint32 *additional_pins);
+extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+
 /* in buf_init.c */
 extern void InitBufferPool(void);
 extern Size BufferShmemSize(void);
diff --git a/src/include/storage/streaming_read.h b/src/include/storage/streaming_read.h
new file mode 100644
index 00000000000..c4d3892bb26
--- /dev/null
+++ b/src/include/storage/streaming_read.h
@@ -0,0 +1,52 @@
+#ifndef STREAMING_READ_H
+#define STREAMING_READ_H
+
+#include "storage/bufmgr.h"
+#include "storage/fd.h"
+#include "storage/smgr.h"
+
+/* Default tuning, reasonable for many users. */
+#define PGSR_FLAG_DEFAULT 0x00
+
+/*
+ * I/O streams that are performing maintenance work on behalf of potentially
+ * many users.
+ */
+#define PGSR_FLAG_MAINTENANCE 0x01
+
+/*
+ * We usually avoid issuing prefetch advice automatically when sequential
+ * access is detected, but this flag explicitly disables it, for cases that
+ * might not be correctly detected.  Explicit advice is known to perform worse
+ * than letting the kernel (at least Linux) detect sequential access.
+ */
+#define PGSR_FLAG_SEQUENTIAL 0x02
+
+/*
+ * We usually ramp up from smaller reads to larger ones, to support users who
+ * don't know if it's worth reading lots of buffers yet.  This flag disables
+ * that, declaring ahead of time that we'll be reading all available buffers.
+ */
+#define PGSR_FLAG_FULL 0x04
+
+struct PgStreamingRead;
+typedef struct PgStreamingRead PgStreamingRead;
+
+/* Callback that returns the next block number to read. */
+typedef BlockNumber (*PgStreamingReadBufferCB) (PgStreamingRead *pgsr,
+												void *pgsr_private,
+												void *per_buffer_private);
+
+extern PgStreamingRead *pg_streaming_read_buffer_alloc(int flags,
+													   void *pgsr_private,
+													   size_t per_buffer_private_size,
+													   BufferAccessStrategy strategy,
+													   BufferManagerRelation bmr,
+													   ForkNumber forknum,
+													   PgStreamingReadBufferCB next_block_cb);
+
+extern void pg_streaming_read_prefetch(PgStreamingRead *pgsr);
+extern Buffer pg_streaming_read_buffer_get_next(PgStreamingRead *pgsr, void **per_buffer_private);
+extern void pg_streaming_read_free(PgStreamingRead *pgsr);
+
+#endif
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index d3a7f75b080..299c77ea69f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2097,6 +2097,8 @@ PgStat_TableCounts
 PgStat_TableStatus
 PgStat_TableXactStatus
 PgStat_WalStats
+PgStreamingRead
+PgStreamingReadRange
 PgXmlErrorContext
 PgXmlStrictness
 Pg_finfo_record
@@ -2267,6 +2269,7 @@ ReInitializeDSMForeignScan_function
 ReScanForeignScan_function
 ReadBufPtrType
 ReadBufferMode
+ReadBuffersOperation
 ReadBytePtrType
 ReadExtraTocPtrType
 ReadFunc
-- 
2.40.1

v10-0003-Vacuum-second-pass-uses-Streaming-Read-interface.patch
From 376edc13c9b03c80dd398c48b68d50a207aed099 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v10 3/3] Vacuum second pass uses Streaming Read interface

Now vacuum's second pass, which removes dead items referring to dead
tuples catalogued in the first pass, uses the streaming read API by
implementing a streaming read callback which returns the next block
containing previously catalogued dead items. A new struct,
VacReapBlkState, is introduced to provide the caller with the starting
and ending indexes of dead items to vacuum.
---
 src/backend/access/heap/vacuumlazy.c | 109 ++++++++++++++++++++-------
 src/tools/pgindent/typedefs.list     |   1 +
 2 files changed, 84 insertions(+), 26 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index a83727f0c7d..4744b1c6e21 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -190,6 +190,12 @@ typedef struct LVRelState
 	BlockNumber missed_dead_pages;	/* # pages with missed dead tuples */
 	BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
 
+	/*
+	 * The index of the next TID in dead_items to reap during the second
+	 * vacuum pass.
+	 */
+	int			idx_prefetch;
+
 	/* Statistics output by us, for table */
 	double		new_rel_tuples; /* new estimated total # of tuples */
 	double		new_live_tuples;	/* new estimated total # of live tuples */
@@ -221,6 +227,20 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * State set up in streaming read callback during vacuum's second pass which
+ * removes dead items referring to dead tuples catalogued in the first pass
+ */
+typedef struct VacReapBlkState
+{
+	/*
+	 * The indexes of the TIDs of the first and last dead tuples in a single
+	 * block in the currently vacuumed relation. The callback will set these
+	 * up prior to adding this block to the stream.
+	 */
+	int			start_idx;
+	int			end_idx;
+} VacReapBlkState;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
@@ -240,8 +260,9 @@ static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf,
 static void lazy_vacuum(LVRelState *vacrel);
 static bool lazy_vacuum_all_indexes(LVRelState *vacrel);
 static void lazy_vacuum_heap_rel(LVRelState *vacrel);
-static int	lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
-								  Buffer buffer, int index, Buffer vmbuffer);
+static void lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+								  Buffer buffer, Buffer vmbuffer,
+								  VacReapBlkState *rbstate);
 static bool lazy_check_wraparound_failsafe(LVRelState *vacrel);
 static void lazy_cleanup_all_indexes(LVRelState *vacrel);
 static IndexBulkDeleteResult *lazy_vacuum_one_index(Relation indrel,
@@ -2400,6 +2421,37 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_pgsr_next(PgStreamingRead *pgsr,
+						 void *pgsr_private,
+						 void *per_buffer_data)
+{
+	BlockNumber blkno;
+	LVRelState *vacrel = pgsr_private;
+	VacReapBlkState *rbstate = per_buffer_data;
+
+	VacDeadItems *dead_items = vacrel->dead_items;
+
+	if (vacrel->idx_prefetch == dead_items->num_items)
+		return InvalidBlockNumber;
+
+	blkno = ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+	rbstate->start_idx = vacrel->idx_prefetch;
+
+	for (; vacrel->idx_prefetch < dead_items->num_items; vacrel->idx_prefetch++)
+	{
+		BlockNumber curblkno =
+			ItemPointerGetBlockNumber(&dead_items->items[vacrel->idx_prefetch]);
+
+		if (blkno != curblkno)
+			break;				/* past end of tuples for this block */
+	}
+
+	rbstate->end_idx = vacrel->idx_prefetch;
+
+	return blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2421,7 +2473,9 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
-	int			index = 0;
+	Buffer		buf;
+	PgStreamingRead *pgsr;
+	VacReapBlkState *rbstate;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2439,17 +2493,21 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 VACUUM_ERRCB_PHASE_VACUUM_HEAP,
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
-	while (index < vacrel->dead_items->num_items)
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(VacReapBlkState), vacrel->bstrategy, BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM, vacuum_reap_lp_pgsr_next);
+
+	while (BufferIsValid(buf =
+						 pg_streaming_read_buffer_get_next(pgsr,
+														   (void **) &rbstate)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 
 		vacuum_delay_point();
 
-		blkno = ItemPointerGetBlockNumber(&vacrel->dead_items->items[index]);
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		/*
 		 * Pin the visibility map page in case we need to mark the page
@@ -2459,10 +2517,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
-		index = lazy_vacuum_heap_page(vacrel, blkno, buf, index, vmbuffer);
+		lazy_vacuum_heap_page(vacrel, blkno, buf, vmbuffer, rbstate);
 
 		/* Now that we've vacuumed the page, record its available space */
 		page = BufferGetPage(buf);
@@ -2481,14 +2537,16 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 	 * We set all LP_DEAD items from the first heap pass to LP_UNUSED during
 	 * the second heap pass.  No more, no less.
 	 */
-	Assert(index > 0);
+	Assert(rbstate->end_idx > 0);
 	Assert(vacrel->num_index_scans > 1 ||
-		   (index == vacrel->lpdead_items &&
+		   (rbstate->end_idx == vacrel->lpdead_items &&
 			vacuumed_pages == vacrel->lpdead_item_pages));
 
+	pg_streaming_read_free(pgsr);
+
 	ereport(DEBUG2,
 			(errmsg("table \"%s\": removed %lld dead item identifiers in %u pages",
-					vacrel->relname, (long long) index, vacuumed_pages)));
+					vacrel->relname, (long long) rbstate->end_idx, vacuumed_pages)));
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
@@ -2502,13 +2560,12 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
  * cleanup lock is also acceptable).  vmbuffer must be valid and already have
  * a pin on blkno's visibility map page.
  *
- * index is an offset into the vacrel->dead_items array for the first listed
- * LP_DEAD item on the page.  The return value is the first index immediately
- * after all LP_DEAD items for the same page in the array.
+ * Given a block and dead items recorded during the first pass, set those items
+ * dead and truncate the line pointer array. Update the VM as appropriate.
  */
-static int
-lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
-					  int index, Buffer vmbuffer)
+static void
+lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno,
+					  Buffer buffer, Buffer vmbuffer, VacReapBlkState *rbstate)
 {
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Page		page = BufferGetPage(buffer);
@@ -2529,16 +2586,17 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	START_CRIT_SECTION();
 
-	for (; index < dead_items->num_items; index++)
+	for (int i = rbstate->start_idx; i < rbstate->end_idx; i++)
 	{
-		BlockNumber tblk;
 		OffsetNumber toff;
+		ItemPointer dead_item;
 		ItemId		itemid;
 
-		tblk = ItemPointerGetBlockNumber(&dead_items->items[index]);
-		if (tblk != blkno)
-			break;				/* past end of tuples for this block */
-		toff = ItemPointerGetOffsetNumber(&dead_items->items[index]);
+		dead_item = &dead_items->items[i];
+
+		Assert(ItemPointerGetBlockNumber(dead_item) == blkno);
+
+		toff = ItemPointerGetOffsetNumber(dead_item);
 		itemid = PageGetItemId(page, toff);
 
 		Assert(ItemIdIsDead(itemid) && !ItemIdHasStorage(itemid));
@@ -2608,7 +2666,6 @@ lazy_vacuum_heap_page(LVRelState *vacrel, BlockNumber blkno, Buffer buffer,
 
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
-	return index;
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 299c77ea69f..f4cfbce70b9 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2973,6 +2973,7 @@ VacOptValue
 VacuumParams
 VacuumRelation
 VacuumStmt
+VacReapBlkState
 ValidIOData
 ValidateIndexState
 ValuesScan
-- 
2.40.1

v10-0002-Vacuum-first-pass-uses-Streaming-Read-interface.patch
From 04ce6a6e22d28cdf6e44a6fb7a8abe85abaca4e8 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 11 Mar 2024 16:19:56 -0400
Subject: [PATCH v10 2/3] Vacuum first pass uses Streaming Read interface

Now vacuum's first pass, which HOT prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by converting
heap_vac_scan_next_block() to a streaming read callback.
---
 src/backend/access/heap/vacuumlazy.c | 75 ++++++++++++++++------------
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 18004907750..a83727f0c7d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -54,6 +54,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/streaming_read.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -223,8 +224,8 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(PgStreamingRead *pgsr,
+											void *pgsr_private, void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -806,10 +807,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
+	PgStreamingRead *pgsr;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
 	VacDeadItems *dead_items = vacrel->dead_items;
 	Buffer		vmbuffer = InvalidBuffer;
@@ -826,19 +828,31 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items->max_items;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	pgsr = pg_streaming_read_buffer_alloc(PGSR_FLAG_MAINTENANCE, vacrel,
+										  sizeof(bool), vacrel->bstrategy,
+										  BMR_REL(vacrel->rel),
+										  MAIN_FORKNUM,
+										  heap_vac_scan_next_block);
+
 	/* Initialize for the first heap_vac_scan_next_block() call */
 	vacrel->current_block = InvalidBlockNumber;
 	vacrel->next_unskippable_block = InvalidBlockNumber;
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf = pg_streaming_read_buffer_get_next(pgsr,
+																 (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
+		BlockNumber blkno;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -905,10 +919,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -964,7 +974,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1018,7 +1028,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1033,6 +1043,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	pg_streaming_read_free(pgsr);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1044,11 +1056,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1058,14 +1070,14 @@ lazy_scan_heap(LVRelState *vacrel)
 /*
  *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * The streaming read callback invokes heap_vac_scan_next_block() every time
+ * lazy_scan_heap() needs the next block to prune and vacuum.  The function
+ * uses the visibility map, vacuum options, and various thresholds to skip
+ * blocks which do not need to be processed and returns the next block to
+ * process or InvalidBlockNumber if there are no remaining blocks.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1073,11 +1085,13 @@ lazy_scan_heap(LVRelState *vacrel)
  * relfrozenxid in that case.  vacrel also holds information about the next
  * unskippable block, as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(PgStreamingRead *pgsr,
+						 void *pgsr_private, void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = pgsr_private;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1090,8 +1104,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1140,9 +1153,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1152,9 +1165,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.40.1

#35 Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#34)
Re: Confine vacuum skip logic to lazy_scan_skip

On Tue, Mar 12, 2024 at 10:03 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I've rebased the attached v10 over top of the changes to
lazy_scan_heap() Heikki just committed and over the v6 streaming read
patch set. I started testing them and see that you are right, we no
longer pin too many buffers. However, the uncached example below is
now slower with streaming read than on master -- it looks to be
because it is doing twice as many WAL writes and syncs. I'm still
investigating why that is.

That makes sense to me. We have 256kB of buffers in our ring, but now
we're trying to read ahead 128kB at a time, so it works out that we
can only flush the WAL accumulated while dirtying half the blocks at a
time, so we flush twice as often.

If I change the ring size to 384kB, allowing for that read-ahead
window, I see approximately the same WAL flushes. Surely we'd never
be able to get the behaviour to match *and* keep the same ring size?
We simply need those 16 extra buffers to have a chance of accumulating
32 dirty buffers, and the associated WAL. Do you see the same result,
or do you think something more than that is wrong here?

Here are some system call traces using your test that helped me see
the behaviour:

1. Unpatched, ie no streaming read, we flush 90kB of WAL generated by
32 pages before we write them out one at a time just before we read in
their replacements. One flush covers the LSNs of all the pages that
will be written, even though it's only called for the first page to be
written. That's because XLogFlush(lsn), if it decides to do anything,
flushes as far as it can... IOW when we hit the *oldest* dirty block,
that's when we write out the WAL up to where we dirtied the *newest*
block, which covers the 32 pwrite() calls here:

pwrite(30,...,90112,0xf90000) = 90112 (0x16000)
fdatasync(30) = 0 (0x0)
pwrite(27,...,8192,0x0) = 8192 (0x2000)
pread(27,...,8192,0x40000) = 8192 (0x2000)
pwrite(27,...,8192,0x2000) = 8192 (0x2000)
pread(27,...,8192,0x42000) = 8192 (0x2000)
pwrite(27,...,8192,0x4000) = 8192 (0x2000)
pread(27,...,8192,0x44000) = 8192 (0x2000)
pwrite(27,...,8192,0x6000) = 8192 (0x2000)
pread(27,...,8192,0x46000) = 8192 (0x2000)
pwrite(27,...,8192,0x8000) = 8192 (0x2000)
pread(27,...,8192,0x48000) = 8192 (0x2000)
pwrite(27,...,8192,0xa000) = 8192 (0x2000)
pread(27,...,8192,0x4a000) = 8192 (0x2000)
pwrite(27,...,8192,0xc000) = 8192 (0x2000)
pread(27,...,8192,0x4c000) = 8192 (0x2000)
pwrite(27,...,8192,0xe000) = 8192 (0x2000)
pread(27,...,8192,0x4e000) = 8192 (0x2000)
pwrite(27,...,8192,0x10000) = 8192 (0x2000)
pread(27,...,8192,0x50000) = 8192 (0x2000)
pwrite(27,...,8192,0x12000) = 8192 (0x2000)
pread(27,...,8192,0x52000) = 8192 (0x2000)
pwrite(27,...,8192,0x14000) = 8192 (0x2000)
pread(27,...,8192,0x54000) = 8192 (0x2000)
pwrite(27,...,8192,0x16000) = 8192 (0x2000)
pread(27,...,8192,0x56000) = 8192 (0x2000)
pwrite(27,...,8192,0x18000) = 8192 (0x2000)
pread(27,...,8192,0x58000) = 8192 (0x2000)
pwrite(27,...,8192,0x1a000) = 8192 (0x2000)
pread(27,...,8192,0x5a000) = 8192 (0x2000)
pwrite(27,...,8192,0x1c000) = 8192 (0x2000)
pread(27,...,8192,0x5c000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e000) = 8192 (0x2000)
pread(27,...,8192,0x5e000) = 8192 (0x2000)
pwrite(27,...,8192,0x20000) = 8192 (0x2000)
pread(27,...,8192,0x60000) = 8192 (0x2000)
pwrite(27,...,8192,0x22000) = 8192 (0x2000)
pread(27,...,8192,0x62000) = 8192 (0x2000)
pwrite(27,...,8192,0x24000) = 8192 (0x2000)
pread(27,...,8192,0x64000) = 8192 (0x2000)
pwrite(27,...,8192,0x26000) = 8192 (0x2000)
pread(27,...,8192,0x66000) = 8192 (0x2000)
pwrite(27,...,8192,0x28000) = 8192 (0x2000)
pread(27,...,8192,0x68000) = 8192 (0x2000)
pwrite(27,...,8192,0x2a000) = 8192 (0x2000)
pread(27,...,8192,0x6a000) = 8192 (0x2000)
pwrite(27,...,8192,0x2c000) = 8192 (0x2000)
pread(27,...,8192,0x6c000) = 8192 (0x2000)
pwrite(27,...,8192,0x2e000) = 8192 (0x2000)
pread(27,...,8192,0x6e000) = 8192 (0x2000)
pwrite(27,...,8192,0x30000) = 8192 (0x2000)
pread(27,...,8192,0x70000) = 8192 (0x2000)
pwrite(27,...,8192,0x32000) = 8192 (0x2000)
pread(27,...,8192,0x72000) = 8192 (0x2000)
pwrite(27,...,8192,0x34000) = 8192 (0x2000)
pread(27,...,8192,0x74000) = 8192 (0x2000)
pwrite(27,...,8192,0x36000) = 8192 (0x2000)
pread(27,...,8192,0x76000) = 8192 (0x2000)
pwrite(27,...,8192,0x38000) = 8192 (0x2000)
pread(27,...,8192,0x78000) = 8192 (0x2000)
pwrite(27,...,8192,0x3a000) = 8192 (0x2000)
pread(27,...,8192,0x7a000) = 8192 (0x2000)
pwrite(27,...,8192,0x3c000) = 8192 (0x2000)
pread(27,...,8192,0x7c000) = 8192 (0x2000)
pwrite(27,...,8192,0x3e000) = 8192 (0x2000)
pread(27,...,8192,0x7e000) = 8192 (0x2000)

(Digression: this alternative tail-write-head-read pattern defeats the
read-ahead and write-behind on a bunch of OSes, but not Linux because
it only seems to worry about the reads, while other Unixes have
write-behind detection too, and I believe at least some are confused
by this pattern of tiny writes following along some distance behind
tiny reads; Andrew Gierth figured that out after noticing poor ring
buffer performance, and we eventually got that fixed for one such
system[1], separating the sequence detection for reads and writes.)

2. With your patches, we replace all those little pread calls with
nice wide calls, yay!, but now we only manage to write out about half
the amount of WAL at a time as you discovered. The repeating blocks
of system calls now look like this, but there are twice as many of
them:

pwrite(32,...,40960,0x224000) = 40960 (0xa000)
fdatasync(32) = 0 (0x0)
pwrite(27,...,8192,0x5c000) = 8192 (0x2000)
preadv(27,[...],3,0x7e000) = 131072 (0x20000)
pwrite(27,...,8192,0x5e000) = 8192 (0x2000)
pwrite(27,...,8192,0x60000) = 8192 (0x2000)
pwrite(27,...,8192,0x62000) = 8192 (0x2000)
pwrite(27,...,8192,0x64000) = 8192 (0x2000)
pwrite(27,...,8192,0x66000) = 8192 (0x2000)
pwrite(27,...,8192,0x68000) = 8192 (0x2000)
pwrite(27,...,8192,0x6a000) = 8192 (0x2000)
pwrite(27,...,8192,0x6c000) = 8192 (0x2000)
pwrite(27,...,8192,0x6e000) = 8192 (0x2000)
pwrite(27,...,8192,0x70000) = 8192 (0x2000)
pwrite(27,...,8192,0x72000) = 8192 (0x2000)
pwrite(27,...,8192,0x74000) = 8192 (0x2000)
pwrite(27,...,8192,0x76000) = 8192 (0x2000)
pwrite(27,...,8192,0x78000) = 8192 (0x2000)
pwrite(27,...,8192,0x7a000) = 8192 (0x2000)

3. With your patches and test but this time using VACUUM
(BUFFER_USAGE_LIMIT = '384kB'), the repeating block grows bigger and
we get the larger WAL flushes back again, because now we're able to
collect 32 blocks' worth of WAL up front again:

pwrite(32,...,90112,0x50c000) = 90112 (0x16000)
fdatasync(32) = 0 (0x0)
pwrite(27,...,8192,0x1dc000) = 8192 (0x2000)
pread(27,...,131072,0x21e000) = 131072 (0x20000)
pwrite(27,...,8192,0x1de000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e0000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e2000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e4000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e6000) = 8192 (0x2000)
pwrite(27,...,8192,0x1e8000) = 8192 (0x2000)
pwrite(27,...,8192,0x1ea000) = 8192 (0x2000)
pwrite(27,...,8192,0x1ec000) = 8192 (0x2000)
pwrite(27,...,8192,0x1ee000) = 8192 (0x2000)
pwrite(27,...,8192,0x1f0000) = 8192 (0x2000)
pwrite(27,...,8192,0x1f2000) = 8192 (0x2000)
pwrite(27,...,8192,0x1f4000) = 8192 (0x2000)
pwrite(27,...,8192,0x1f6000) = 8192 (0x2000)
pwrite(27,...,8192,0x1f8000) = 8192 (0x2000)
pwrite(27,...,8192,0x1fa000) = 8192 (0x2000)
pwrite(27,...,8192,0x1fc000) = 8192 (0x2000)
preadv(27,[...],3,0x23e000) = 131072 (0x20000)
pwrite(27,...,8192,0x1fe000) = 8192 (0x2000)
pwrite(27,...,8192,0x200000) = 8192 (0x2000)
pwrite(27,...,8192,0x202000) = 8192 (0x2000)
pwrite(27,...,8192,0x204000) = 8192 (0x2000)
pwrite(27,...,8192,0x206000) = 8192 (0x2000)
pwrite(27,...,8192,0x208000) = 8192 (0x2000)
pwrite(27,...,8192,0x20a000) = 8192 (0x2000)
pwrite(27,...,8192,0x20c000) = 8192 (0x2000)
pwrite(27,...,8192,0x20e000) = 8192 (0x2000)
pwrite(27,...,8192,0x210000) = 8192 (0x2000)
pwrite(27,...,8192,0x212000) = 8192 (0x2000)
pwrite(27,...,8192,0x214000) = 8192 (0x2000)
pwrite(27,...,8192,0x216000) = 8192 (0x2000)
pwrite(27,...,8192,0x218000) = 8192 (0x2000)
pwrite(27,...,8192,0x21a000) = 8192 (0x2000)

4. For learning/exploration only, I rebased my experimental vectored
FlushBuffers() patch, which teaches the checkpointer to write relation
data out using smgrwritev(). The checkpointer explicitly sorts
blocks, but I think ring buffers should naturally often contain
consecutive blocks in ring order. Highly experimental POC code pushed
to a public branch[2], but I am not proposing anything here, just
trying to understand things. The nicest looking system call trace was
with BUFFER_USAGE_LIMIT set to 512kB, so it could do its writes, reads
and WAL writes 128kB at a time:

pwrite(32,...,131072,0xfc6000) = 131072 (0x20000)
fdatasync(32) = 0 (0x0)
pwrite(27,...,131072,0x6c0000) = 131072 (0x20000)
pread(27,...,131072,0x73e000) = 131072 (0x20000)
pwrite(27,...,131072,0x6e0000) = 131072 (0x20000)
pread(27,...,131072,0x75e000) = 131072 (0x20000)
pwritev(27,[...],3,0x77e000) = 131072 (0x20000)
preadv(27,[...],3,0x77e000) = 131072 (0x20000)

That was a fun experiment, but... I recognise that efficient cleaning
of ring buffers is a Hard Problem requiring more concurrency: it's
just too late to be flushing that WAL. But we also don't want to
start writing back data immediately after dirtying pages (cf. OS
write-behind for big sequential writes in traditional Unixes), because
we're not allowed to write data out without writing the WAL first and
we currently need to build up bigger WAL writes to do so efficiently
(cf. some other systems that can write out fragments of WAL
concurrently so the latency-vs-throughput trade-off doesn't have to be
so extreme). So we want to defer writing it, but not too long. We
need something cleaning our buffers (or at least flushing the
associated WAL, but preferably also writing the data) not too late and
not too early, and more in sync with our scan than the WAL writer is.
What that machinery should look like I don't know (but I believe
Andres has ideas).

[1]: https://github.com/freebsd/freebsd-src/commit/f2706588730a5d3b9a687ba8d4269e386650cc4f
[2]: https://github.com/macdice/postgres/tree/vectored-ring-buffer

#36Melanie Plageman
melanieplageman@gmail.com
In reply to: Thomas Munro (#35)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Mar 17, 2024 at 2:53 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Mar 12, 2024 at 10:03 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I've rebased the attached v10 over top of the changes to
lazy_scan_heap() Heikki just committed and over the v6 streaming read
patch set. I started testing them and see that you are right, we no
longer pin too many buffers. However, the uncached example below is
now slower with streaming read than on master -- it looks to be
because it is doing twice as many WAL writes and syncs. I'm still
investigating why that is.

--snip--


I've attached a WIP v11 streaming vacuum patch set here that is
rebased over master (by Thomas), so that I could add a CF entry for
it. It still has the problem with the extra WAL write and fsync calls
investigated by Thomas above. Thomas has some work in progress doing
streaming write-behind to alleviate the issues with the buffer access
strategy and streaming reads. When he gets a version of that ready to
share, he will start a new "Streaming Vacuum" thread.

- Melanie

Attachments:

v11-0002-Refactor-tidstore.c-memory-management.patchtext/x-patch; charset=US-ASCII; name=v11-0002-Refactor-tidstore.c-memory-management.patchDownload
From 050b6c3fa73c9153aeef58fcd306533c1008802e Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 26 Apr 2024 08:32:44 +1200
Subject: [PATCH v11 2/3] Refactor tidstore.c memory management.

Previously, TidStoreIterateNext() would expand the set of offsets for
each block into a buffer that it overwrote each time.  In order to be
able to collect the offsets for multiple blocks before working with
them, change the contract.  Now, the offsets are obtained by a separate
call to TidStoreGetBlockOffsets(), which can be called at a later time,
and TidStoreIteratorResult objects are safe to copy and store in a
queue.

This will be used by a later patch, to avoid the need for expensive
extra copies of offset array and associated memory management.
---
 src/backend/access/common/tidstore.c          | 68 +++++++++----------
 src/backend/access/heap/vacuumlazy.c          |  9 ++-
 src/include/access/tidstore.h                 | 12 ++--
 .../modules/test_tidstore/test_tidstore.c     |  9 ++-
 4 files changed, 53 insertions(+), 45 deletions(-)

diff --git a/src/backend/access/common/tidstore.c b/src/backend/access/common/tidstore.c
index fb3949d69f6..c3c1987204b 100644
--- a/src/backend/access/common/tidstore.c
+++ b/src/backend/access/common/tidstore.c
@@ -147,9 +147,6 @@ struct TidStoreIter
 	TidStoreIterResult output;
 };
 
-static void tidstore_iter_extract_tids(TidStoreIter *iter, BlockNumber blkno,
-									   BlocktableEntry *page);
-
 /*
  * Create a TidStore. The TidStore will live in the memory context that is
  * CurrentMemoryContext at the time of this call. The TID storage, backed
@@ -486,13 +483,6 @@ TidStoreBeginIterate(TidStore *ts)
 	iter = palloc0(sizeof(TidStoreIter));
 	iter->ts = ts;
 
-	/*
-	 * We start with an array large enough to contain at least the offsets
-	 * from one completely full bitmap element.
-	 */
-	iter->output.max_offset = 2 * BITS_PER_BITMAPWORD;
-	iter->output.offsets = palloc(sizeof(OffsetNumber) * iter->output.max_offset);
-
 	if (TidStoreIsShared(ts))
 		iter->tree_iter.shared = shared_ts_begin_iterate(ts->tree.shared);
 	else
@@ -503,9 +493,9 @@ TidStoreBeginIterate(TidStore *ts)
 
 
 /*
- * Scan the TidStore and return the TIDs of the next block. The offsets in
- * each iteration result are ordered, as are the block numbers over all
- * iterations.
+ * Return a result that contains the next block number and that can be used to
+ * obtain the set of offsets by calling TidStoreGetBlockOffsets().  The result
+ * is copyable.
  */
 TidStoreIterResult *
 TidStoreIterateNext(TidStoreIter *iter)
@@ -521,10 +511,10 @@ TidStoreIterateNext(TidStoreIter *iter)
 	if (page == NULL)
 		return NULL;
 
-	/* Collect TIDs from the key-value pair */
-	tidstore_iter_extract_tids(iter, (BlockNumber) key, page);
+	iter->output.blkno = key;
+	iter->output.internal_page = page;
 
-	return &(iter->output);
+	return &iter->output;
 }
 
 /*
@@ -540,7 +530,6 @@ TidStoreEndIterate(TidStoreIter *iter)
 	else
 		local_ts_end_iterate(iter->tree_iter.local);
 
-	pfree(iter->output.offsets);
 	pfree(iter);
 }
 
@@ -575,16 +564,19 @@ TidStoreGetHandle(TidStore *ts)
 	return (dsa_pointer) shared_ts_get_handle(ts->tree.shared);
 }
 
-/* Extract TIDs from the given key-value pair */
-static void
-tidstore_iter_extract_tids(TidStoreIter *iter, BlockNumber blkno,
-						   BlocktableEntry *page)
+/*
+ * Given a TidStoreIterResult returned by TidStoreIterateNext(), extract the
+ * offset numbers.  Returns the number of offsets filled in, if <=
+ * max_offsets.  Otherwise, fills in as much as it can in the given space, and
+ * returns the size of the buffer that would be needed.
+ */
+int
+TidStoreGetBlockOffsets(TidStoreIterResult *result,
+						OffsetNumber *offsets,
+						int max_offsets)
 {
-	TidStoreIterResult *result = (&iter->output);
-	int			wordnum;
-
-	result->num_offsets = 0;
-	result->blkno = blkno;
+	BlocktableEntry *page = result->internal_page;
+	int			num_offsets = 0;
 
 	if (page->header.nwords == 0)
 	{
@@ -592,31 +584,33 @@ tidstore_iter_extract_tids(TidStoreIter *iter, BlockNumber blkno,
 		for (int i = 0; i < NUM_FULL_OFFSETS; i++)
 		{
 			if (page->header.full_offsets[i] != InvalidOffsetNumber)
-				result->offsets[result->num_offsets++] = page->header.full_offsets[i];
+			{
+				if (num_offsets < max_offsets)
+					offsets[num_offsets] = page->header.full_offsets[i];
+				num_offsets++;
+			}
 		}
 	}
 	else
 	{
-		for (wordnum = 0; wordnum < page->header.nwords; wordnum++)
+		for (int wordnum = 0; wordnum < page->header.nwords; wordnum++)
 		{
 			bitmapword	w = page->words[wordnum];
 			int			off = wordnum * BITS_PER_BITMAPWORD;
 
-			/* Make sure there is enough space to add offsets */
-			if ((result->num_offsets + BITS_PER_BITMAPWORD) > result->max_offset)
-			{
-				result->max_offset *= 2;
-				result->offsets = repalloc(result->offsets,
-										   sizeof(OffsetNumber) * result->max_offset);
-			}
-
 			while (w != 0)
 			{
 				if (w & 1)
-					result->offsets[result->num_offsets++] = (OffsetNumber) off;
+				{
+					if (num_offsets < max_offsets)
+						offsets[num_offsets] = (OffsetNumber) off;
+					num_offsets++;
+				}
 				off++;
 				w >>= 1;
 			}
 		}
 	}
+
+	return num_offsets;
 }
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f76ef2e7c63..19c13671666 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2144,12 +2144,17 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		Buffer		buf;
 		Page		page;
 		Size		freespace;
+		OffsetNumber offsets[MaxOffsetNumber];
+		int			num_offsets;
 
 		vacuum_delay_point();
 
 		blkno = iter_result->blkno;
 		vacrel->blkno = blkno;
 
+		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
+		Assert(num_offsets <= lengthof(offsets));
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -2161,8 +2166,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
 								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
-		lazy_vacuum_heap_page(vacrel, blkno, buf, iter_result->offsets,
-							  iter_result->num_offsets, vmbuffer);
+		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
+							  num_offsets, vmbuffer);
 
 		/* Now that we've vacuumed the page, record its available space */
 		page = BufferGetPage(buf);
diff --git a/src/include/access/tidstore.h b/src/include/access/tidstore.h
index 32aa9995193..d95cabd7b5e 100644
--- a/src/include/access/tidstore.h
+++ b/src/include/access/tidstore.h
@@ -20,13 +20,14 @@
 typedef struct TidStore TidStore;
 typedef struct TidStoreIter TidStoreIter;
 
-/* Result struct for TidStoreIterateNext */
+/*
+ * Result struct for TidStoreIterateNext.  This is copyable, but should be
+ * treated as opaque.  Call TidStoreGetOffsets() to obtain the offsets.
+ */
 typedef struct TidStoreIterResult
 {
 	BlockNumber blkno;
-	int			max_offset;
-	int			num_offsets;
-	OffsetNumber *offsets;
+	void	   *internal_page;
 } TidStoreIterResult;
 
 extern TidStore *TidStoreCreateLocal(size_t max_bytes, bool insert_only);
@@ -42,6 +43,9 @@ extern void TidStoreSetBlockOffsets(TidStore *ts, BlockNumber blkno, OffsetNumbe
 extern bool TidStoreIsMember(TidStore *ts, ItemPointer tid);
 extern TidStoreIter *TidStoreBeginIterate(TidStore *ts);
 extern TidStoreIterResult *TidStoreIterateNext(TidStoreIter *iter);
+extern int	TidStoreGetBlockOffsets(TidStoreIterResult *result,
+									OffsetNumber *offsets,
+									int max_offsets);
 extern void TidStoreEndIterate(TidStoreIter *iter);
 extern size_t TidStoreMemoryUsage(TidStore *ts);
 extern dsa_pointer TidStoreGetHandle(TidStore *ts);
diff --git a/src/test/modules/test_tidstore/test_tidstore.c b/src/test/modules/test_tidstore/test_tidstore.c
index 3f6a11bf21c..94ddcf1de82 100644
--- a/src/test/modules/test_tidstore/test_tidstore.c
+++ b/src/test/modules/test_tidstore/test_tidstore.c
@@ -267,9 +267,14 @@ check_set_block_offsets(PG_FUNCTION_ARGS)
 	iter = TidStoreBeginIterate(tidstore);
 	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
 	{
-		for (int i = 0; i < iter_result->num_offsets; i++)
+		OffsetNumber offsets[MaxOffsetNumber];
+		int			num_offsets;
+
+		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
+		Assert(num_offsets <= lengthof(offsets));
+		for (int i = 0; i < num_offsets; i++)
 			ItemPointerSet(&(items.iter_tids[num_iter_tids++]), iter_result->blkno,
-						   iter_result->offsets[i]);
+						   offsets[i]);
 	}
 	TidStoreEndIterate(iter);
 	TidStoreUnlock(tidstore);
-- 
2.34.1

v11-0001-Use-streaming-I-O-in-VACUUM-first-pass.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Use-streaming-I-O-in-VACUUM-first-pass.patchDownload
From fe4ab2059580d52c2855a6ea2c6bac80d06970c4 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 11 Mar 2024 16:19:56 -0400
Subject: [PATCH v11 1/3] Use streaming I/O in VACUUM first pass.

Now vacuum's first pass, which HOT-prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by converting
heap_vac_scan_next_block() to a read stream callback.

Author: Melanie Plageman <melanieplageman@gmail.com>
---
 src/backend/access/heap/vacuumlazy.c | 80 +++++++++++++++++-----------
 1 file changed, 49 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3f88cf1e8ef..f76ef2e7c63 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -55,6 +55,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -229,8 +230,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -815,10 +817,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
 	TidStore   *dead_items = vacrel->dead_items;
 	VacDeadItemsInfo *dead_items_info = vacrel->dead_items_info;
@@ -836,19 +839,33 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items_info->max_bytes;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
 	/* Initialize for the first heap_vac_scan_next_block() call */
 	vacrel->current_block = InvalidBlockNumber;
 	vacrel->next_unskippable_block = InvalidBlockNumber;
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
+		BlockNumber blkno;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -914,10 +931,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -973,7 +986,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1027,7 +1040,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1042,6 +1055,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1053,11 +1068,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1067,14 +1082,14 @@ lazy_scan_heap(LVRelState *vacrel)
 /*
  *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * The streaming read callback invokes heap_vac_scan_next_block() every time
+ * lazy_scan_heap() needs the next block to prune and vacuum.  The function
+ * uses the visibility map, vacuum options, and various thresholds to skip
+ * blocks which do not need to be processed and returns the next block to
+ * process or InvalidBlockNumber if there are no remaining blocks.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1082,11 +1097,14 @@ lazy_scan_heap(LVRelState *vacrel)
  * relfrozenxid in that case.  vacrel also holds information about the next
  * unskippable block, as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1099,8 +1117,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		vacrel->current_block = vacrel->rel_pages;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1149,9 +1167,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1161,9 +1179,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

v11-0003-Use-streaming-I-O-in-VACUUM-second-pass.patchtext/x-patch; charset=US-ASCII; name=v11-0003-Use-streaming-I-O-in-VACUUM-second-pass.patchDownload
From 5c74eccade69374a449f0b0fd4003545863c9538 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v11 3/3] Use streaming I/O in VACUUM second pass.

Now vacuum's second pass, which removes dead items referring to dead
tuples collected in the first pass, uses a read stream that looks ahead
in the TidStore.

Originally developed by Melanie, refactored to work with the new
TidStore by Thomas.

Author: Melanie Plageman <melanieplageman@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com>
---
 src/backend/access/heap/vacuumlazy.c | 38 +++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 19c13671666..14eee89af83 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2098,6 +2098,24 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2118,6 +2136,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2138,10 +2158,18 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &iter_result)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
@@ -2149,8 +2177,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
@@ -2163,8 +2190,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2177,6 +2202,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

#37Noah Misch
noah@leadboat.com
In reply to: Melanie Plageman (#36)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Jun 28, 2024 at 05:36:25PM -0400, Melanie Plageman wrote:

I've attached a WIP v11 streaming vacuum patch set here that is
rebased over master (by Thomas), so that I could add a CF entry for
it. It still has the problem with the extra WAL write and fsync calls
investigated by Thomas above. Thomas has some work in progress doing
streaming write-behind to alleviate the issues with the buffer access
strategy and streaming reads. When he gets a version of that ready to
share, he will start a new "Streaming Vacuum" thread.

To avoid reviewing the wrong patch, I'm writing to verify the status here.
This is Needs Review in the commitfest. I think one of these two holds:

1. Needs Review is valid.
2. It's actually Waiting on Author. You're commissioning a review of the
future-thread patch, not this one.

If it's (1), given the WIP marking, what is the scope of the review you seek?
I'm guessing performance is out of scope; what else is in or out of scope?

#38Melanie Plageman
melanieplageman@gmail.com
In reply to: Noah Misch (#37)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Jul 7, 2024 at 10:49 AM Noah Misch <noah@leadboat.com> wrote:

On Fri, Jun 28, 2024 at 05:36:25PM -0400, Melanie Plageman wrote:

I've attached a WIP v11 streaming vacuum patch set here that is
rebased over master (by Thomas), so that I could add a CF entry for
it. It still has the problem with the extra WAL write and fsync calls
investigated by Thomas above. Thomas has some work in progress doing
streaming write-behind to alleviate the issues with the buffer access
strategy and streaming reads. When he gets a version of that ready to
share, he will start a new "Streaming Vacuum" thread.

To avoid reviewing the wrong patch, I'm writing to verify the status here.
This is Needs Review in the commitfest. I think one of these two holds:

1. Needs Review is valid.
2. It's actually Waiting on Author. You're commissioning a review of the
future-thread patch, not this one.

If it's (1), given the WIP marking, what is the scope of the review you seek?
I'm guessing performance is out of scope; what else is in or out of scope?

Ah, you're right. I moved it to "Waiting on Author" as we are waiting
on Thomas' version which has a fix for the extra WAL write/sync
behavior.

Sorry for the "Needs Review" noise!

- Melanie

#39Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#37)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Jul 8, 2024 at 2:49 AM Noah Misch <noah@leadboat.com> wrote:

what is the scope of the review you seek?

The patch "Refactor tidstore.c memory management." could definitely
use some review. I wasn't sure if that should be proposed in a new
thread of its own, but then the need for it comes from this
streamifying project, so... The basic problem was that we want to
build up a stream of block to be vacuumed (so that we can perform the
I/O combining etc) + some extra data attached to each buffer, in this
case the TID list, but the implementation of tidstore.c in master
would require us to make an extra intermediate copy of the TIDs,
because it keeps overwriting its internal buffer. The proposal here
is to make it so that you can get get a tiny copyable object that can
later be used to retrieve the data into a caller-supplied buffer, so
that tidstore.c's iterator machinery doesn't have to have its own
internal buffer at all, and then calling code can safely queue up a
few of these at once.

#40Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#39)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:

On Mon, Jul 8, 2024 at 2:49 AM Noah Misch <noah@leadboat.com> wrote:

what is the scope of the review you seek?

The patch "Refactor tidstore.c memory management." could definitely
use some review.

That's reasonable. radixtree already forbids mutations concurrent with
iteration, so there's no new concurrency hazard. One alternative is
per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
measurably. That patch is good to go apart from these trivialities:

-	return &(iter->output);
+	return &iter->output;

This cosmetic change is orthogonal to the patch's mission.

-		for (wordnum = 0; wordnum < page->header.nwords; wordnum++)
+		for (int wordnum = 0; wordnum < page->header.nwords; wordnum++)

Likewise.

#41Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#40)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Tue, Jul 16, 2024 at 1:52 PM Noah Misch <noah@leadboat.com> wrote:

On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:
That's reasonable. radixtree already forbids mutations concurrent with
iteration, so there's no new concurrency hazard. One alternative is
per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
measurably. That patch is good to go apart from these trivialities:

Thanks! I have pushed that patch, without those changes you didn't like.

Here are Melanie's patches again. They work, and the WAL flush
frequency problem is mostly gone since we increased the BAS_VACUUM
default ring size (commit 98f320eb), but I'm still looking into how
this read-ahead and the write-behind generated by vacuum (using
patches not yet posted) should interact with each other and the ring
system, and bouncing ideas around about that with my colleagues. More
on that soon, hopefully. I suspect that there won't be changes to
these patches as a result, but I still want to hold off for a bit.

Attachments:

v12-0001-Use-streaming-I-O-in-VACUUM-first-pass.patchtext/x-patch; charset=US-ASCII; name=v12-0001-Use-streaming-I-O-in-VACUUM-first-pass.patchDownload
From ac826d0187252bf446fb5f12489def5208d20289 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 11 Mar 2024 16:19:56 -0400
Subject: [PATCH v12 1/2] Use streaming I/O in VACUUM first pass.

Now vacuum's first pass, which HOT-prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by converting
heap_vac_scan_next_block() to a read stream callback.

Author: Melanie Plageman <melanieplageman@gmail.com>
---
 src/backend/access/heap/vacuumlazy.c | 80 +++++++++++++++++-----------
 1 file changed, 49 insertions(+), 31 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 835b53415d0..d92fac7e7e3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -55,6 +55,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
@@ -229,8 +230,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -815,10 +817,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm;
 
 	TidStore   *dead_items = vacrel->dead_items;
 	VacDeadItemsInfo *dead_items_info = vacrel->dead_items_info;
@@ -836,19 +839,33 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = dead_items_info->max_bytes;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
 	/* Initialize for the first heap_vac_scan_next_block() call */
 	vacrel->current_block = InvalidBlockNumber;
 	vacrel->next_unskippable_block = InvalidBlockNumber;
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &all_visible_according_to_vm)))
 	{
-		Buffer		buf;
+		BlockNumber blkno;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+
 		vacrel->scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -914,10 +931,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -973,7 +986,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1027,7 +1040,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1042,6 +1055,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1053,11 +1068,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1067,14 +1082,14 @@ lazy_scan_heap(LVRelState *vacrel)
 /*
  *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * The streaming read callback invokes heap_vac_scan_next_block() every time
+ * lazy_scan_heap() needs the next block to prune and vacuum.  The function
+ * uses the visibility map, vacuum options, and various thresholds to skip
+ * blocks which do not need to be processed and returns the next block to
+ * process or InvalidBlockNumber if there are no remaining blocks.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1082,11 +1097,14 @@ lazy_scan_heap(LVRelState *vacrel)
  * relfrozenxid in that case.  vacrel also holds information about the next
  * unskippable block, as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1099,8 +1117,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		vacrel->current_block = vacrel->rel_pages;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1149,9 +1167,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1161,9 +1179,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.45.2

v12-0002-Use-streaming-I-O-in-VACUUM-second-pass.patch (text/x-patch)
From 096f16b1e76ac28438752e7828b7c325f84edf4e Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 27 Feb 2024 14:35:36 -0500
Subject: [PATCH v12 2/2] Use streaming I/O in VACUUM second pass.

Now vacuum's second pass, which removes dead items referring to dead
tuples collected in the first pass, uses a read stream that looks ahead
in the TidStore.

Author: Melanie Plageman <melanieplageman@gmail.com>
---
 src/backend/access/heap/vacuumlazy.c | 38 +++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index d92fac7e7e3..2b7d191d175 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2098,6 +2098,24 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2118,6 +2136,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2138,10 +2158,18 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &iter_result)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
@@ -2149,8 +2177,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
@@ -2163,8 +2190,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2177,6 +2202,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.45.2

#42Tomas Vondra
tomas@vondra.me
In reply to: Thomas Munro (#41)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 7/24/24 07:40, Thomas Munro wrote:

On Tue, Jul 16, 2024 at 1:52 PM Noah Misch <noah@leadboat.com> wrote:

On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:
That's reasonable. radixtree already forbids mutations concurrent with
iteration, so there's no new concurrency hazard. One alternative is
per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
measurably. That patch is good to go apart from these trivialities:

Thanks! I have pushed that patch, without those changes you didn't like.

Here's are Melanie's patches again. They work, and the WAL flush
frequency problem is mostly gone since we increased the BAS_VACUUM
default ring size (commit 98f320eb), but I'm still looking into how
this read-ahead and the write-behind generated by vacuum (using
patches not yet posted) should interact with each other and the ring
system, and bouncing ideas around about that with my colleagues. More
on that soon, hopefully. I suspect that there won't be changes to
these patches as a result, but I still want to hold off for a bit.

I've been looking at some other vacuum-related patches, so I took a look
at these remaining bits too. I don't have much to say about the code
(seems perfectly fine to me), so I decided to do a bit of testing.

I did a simple test (see the attached .sh script) that runs vacuum on a
table with different fractions of rows updated. The table has ~100 rows
per page, with 50% fill factor, and the updates touch ~1%, 0.1%, 0.05%
rows, and then even smaller fractions up to 0.0001%. This determines how
many pages get touched. With 1% fraction almost every page gets
modified, then the fraction quickly drops. With 0.0001%, only about 0.01%
of pages get modified.

Attached is a CSV with raw results from two machines, and also a PDF
with a comparison of the two builds (master vs. patched). In the vast
majority of cases, the patched build is much faster, usually 2-3x.

There are a couple cases where it regresses by ~30%, but only on one of
the machines with older storage (RAID on SATA SSDs), with 1% rows
updated (which means almost all pages get modified). So this is about
sequential access. It's a bit weird, but probably not a fatal issue.

There are also a couple of smaller regressions on the "xeon" machine with
M.2 SSD, for lower update fractions - 0.05% rows updated means 5% pages
need vacuum. But I think this is more a sign of us being too aggressive
in detecting (and forcing) sequential patterns - on master, we end up
scanning 50% pages, thanks to SKIP_PAGES_THRESHOLD. I'll start a new
thread about that ...

Anyway, these results look very nice - a couple limited regressions,
substantial speedups in plausible/common cases.

regards

--
Tomas Vondra

Attachments:

run-vacuum-stream.sh (application/x-shellscript)
vacuum-stream-results.csv (text/csv)
vacuum-streaming.pdf (application/pdf)
#43Melanie Plageman
melanieplageman@gmail.com
In reply to: Tomas Vondra (#42)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Dec 15, 2024 at 10:10 AM Tomas Vondra <tomas@vondra.me> wrote:

I've been looking at some other vacuum-related patches, so I took a look
at these remaining bits too. I don't have much to say about the code
(seems perfectly fine to me), so I decided to do a bit of testing.

Thanks for doing this!

I did a simple test (see the attached .sh script) that runs vacuum on a
table with different fractions of rows updated. The table has ~100 rows
per page, with 50% fill factor, and the updates touch ~1%, 0.1%, 0.05%
rows, and then even smaller fractions up to 0.0001%. This determines how
many pages get touched. With 1% fraction almost every page gets
modified, then the fraction quickly drops. With 0.0001%, only about 0.01%
of pages get modified.

Attached is a CSV with raw results from two machines, and also a PDF
with a comparison of the two builds (master vs. patched). In the vast
majority of cases, the patched build is much faster, usually 2-3x.

There are a couple cases where it regresses by ~30%, but only on one of
the machines with older storage (RAID on SATA SSDs), with 1% rows
updated (which means almost all pages get modified). So this is about
sequential access. It's a bit weird, but probably not a fatal issue.

Actually, while rebasing these with the intent to start investigating
the regressions you report, I noticed something quite wrong with my
code. In lazy_scan_heap(), I had put read_stream_next_buffer() before
a few expensive operations (like a round of index vacuuming and dead
item reaping if the TID store is full). It returns the pinned buffer,
so this could mean a buffer remaining pinned for a whole round of
vacuuming of items from the TID store. Not good. Anyway, this version
has fixed that. I do wonder if there is any chance that this affected
your benchmarks.

I've attached a new v13. Perhaps you could give it another go and see
if the regressions are still there?

There are also a couple of smaller regressions on the "xeon" machine with
M.2 SSD, for lower update fractions - 0.05% rows updated means 5% pages
need vacuum. But I think this is more a sign of us being too aggressive
in detecting (and forcing) sequential patterns - on master, we end up
scanning 50% pages, thanks to SKIP_PAGES_THRESHOLD. I'll start a new
thread about that ...

Hmm. If only 5% of pages need vacuum, then you are right, readahead
seems like it would often be wasted (depending on _which_ 5% needs
vacuuming). But, the read stream API will only prefetch and build up
larger I/Os of blocks we actually need. So, it seems like this would
behave the same on master and with the patch. That is, both would do
extra unneeded I/O because of SKIP_PAGES_THRESHOLD. Is the 5% of the
table that needs vacuuming dispersed randomly throughout or
concentrated?

If it is concentrated and readahead would be useful, then maybe we
need to increase read_ahead_kb. You mentioned off-list that the
read_ahead_kb on this machine for this SSD was 128kB -- the same as
io_combine_limit. If we want to read ahead, issuing 128 kB I/Os might
be thwarting us and increasing latency.

- Melanie

Attachments:

v13-0001-Use-streaming-I-O-in-VACUUM-s-first-phase.patch (application/octet-stream)
From 17cae28be7d432b53a7cd1b8fdc04c8abd92be02 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 15 Jan 2025 19:56:37 -0500
Subject: [PATCH v13 1/2] Use streaming I/O in VACUUM's first phase

Now vacuum's first phase, which HOT-prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by converting
heap_vac_scan_next_block() to a read stream callback.
---
 src/backend/access/heap/vacuumlazy.c | 100 +++++++++++++++++----------
 1 file changed, 62 insertions(+), 38 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 5b0e790e121..dce5dae9d02 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -108,6 +108,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -296,8 +297,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -904,10 +906,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm = NULL;
 
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -923,26 +926,27 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = vacrel->dead_items_info->max_bytes;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
 	/* Initialize for the first heap_vac_scan_next_block() call */
 	vacrel->current_block = InvalidBlockNumber;
 	vacrel->next_unskippable_block = InvalidBlockNumber;
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point();
 
 		/*
@@ -983,7 +987,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages.  Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 									blkno);
@@ -994,6 +999,24 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, (void **) &all_visible_according_to_vm);
+
+		if (!BufferIsValid(buf))
+			break;
+
+		Assert(all_visible_according_to_vm);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+
+		vacrel->scanned_pages++;
+
+		blkno = BufferGetBlockNumber(buf);
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1001,10 +1024,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1060,7 +1079,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1114,7 +1133,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1129,6 +1148,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1140,11 +1161,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1154,14 +1175,14 @@ lazy_scan_heap(LVRelState *vacrel)
 /*
  *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * The streaming read callback invokes heap_vac_scan_next_block() every time
+ * lazy_scan_heap() needs the next block to prune and vacuum.  The function
+ * uses the visibility map, vacuum options, and various thresholds to skip
+ * blocks which do not need to be processed and returns the next block to
+ * process or InvalidBlockNumber if there are no remaining blocks.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1169,11 +1190,14 @@ lazy_scan_heap(LVRelState *vacrel)
  * relfrozenxid in that case.  vacrel also holds information about the next
  * unskippable block, as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1186,8 +1210,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		vacrel->current_block = vacrel->rel_pages;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1236,9 +1260,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1248,9 +1272,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.45.2

v13-0002-Use-streaming-I-O-in-VACUUM-s-third-phase.patch (application/octet-stream)
From 24c8f1da03f5bb887a23581fced19f74a47fb2d1 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 15 Jan 2025 20:16:16 -0500
Subject: [PATCH v13 2/2] Use streaming I/O in VACUUM's third phase

Now vacuum's third phase (its second pass over the heap), which removes
dead items referring to dead tuples collected in the first phase, uses a
read stream that looks ahead in the TidStore.
---
 src/backend/access/heap/vacuumlazy.c | 39 +++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index dce5dae9d02..c07ce578931 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2247,6 +2247,24 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2267,6 +2285,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2287,10 +2307,18 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &iter_result)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
@@ -2298,8 +2326,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
@@ -2312,8 +2339,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2326,6 +2351,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.45.2

#44Tomas Vondra
tomas@vondra.me
In reply to: Melanie Plageman (#43)
4 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On 1/16/25 02:45, Melanie Plageman wrote:

On Sun, Dec 15, 2024 at 10:10 AM Tomas Vondra <tomas@vondra.me> wrote:

I've been looking at some other vacuum-related patches, so I took a look
at these remaining bits too. I don't have much to say about the code
(seems perfectly fine to me), so I decided to do a bit of testing.

Thanks for doing this!

I did a simple test (see the attached .sh script) that runs vacuum on a
table with different fractions of rows updated. The table has ~100 rows
per page, with 50% fill factor, and the updates touch ~1%, 0.1%, 0.05%
rows, and then even smaller fractions up to 0.0001%. This determines how
many pages get touched. With 1% fraction almost every page gets
modified, then the fraction quickly drops. With 0.0001%, only about 0.01%
of pages get modified.

Attached is a CSV with raw results from two machines, and also a PDF
with a comparison of the two builds (master vs. patched). In the vast
majority of cases, the patched build is much faster, usually 2-3x.

There are a couple cases where it regresses by ~30%, but only on one of
the machines with older storage (RAID on SATA SSDs), with 1% rows
updated (which means almost all pages get modified). So this is about
sequential access. It's a bit weird, but probably not a fatal issue.

Actually, while rebasing these with the intent to start investigating
the regressions you report, I noticed something quite wrong with my
code. In lazy_scan_heap(), I had put read_stream_next_buffer() before
a few expensive operations (like a round of index vacuuming and dead
item reaping if the TID store is full). It returns the pinned buffer,
so this could mean a buffer remaining pinned for a whole round of
vacuuming of items from the TID store. Not good. Anyway, this version
has fixed that. I do wonder if there is any chance that this affected
your benchmarks.
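The ordering issue described above can be modeled with a toy loop (my own
illustration, not PostgreSQL code; all names here are invented): if the
stream hands back a pinned buffer before the "dead items full" check, that
pin is held across the round of index vacuuming, whereas fetching the next
buffer only after those checks avoids it.

```python
def scan(blocks, dead_items_limit, next_before_checks):
    """Toy model of lazy_scan_heap()'s loop ordering.  Returns True if a
    buffer stayed pinned across a simulated round of index vacuuming."""
    stream = iter(blocks)
    dead_items = 0
    pinned = None
    pinned_during_index_vacuum = False
    while True:
        if next_before_checks:
            pinned = next(stream, None)   # buggy order: pin first
        if dead_items >= dead_items_limit:
            # a "round of index vacuuming" happens here
            if pinned is not None:
                pinned_during_index_vacuum = True
            dead_items = 0
        if not next_before_checks:
            pinned = next(stream, None)   # fixed order: pin afterwards
        if pinned is None:
            break
        dead_items += 1                   # pretend each page adds dead items
        pinned = None                     # page processed, pin released
    return pinned_during_index_vacuum

print(scan(range(10), 3, next_before_checks=True))   # True  (pin held)
print(scan(range(10), 3, next_before_checks=False))  # False (pin released)
```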

I've attached a new v13. Perhaps you could give it another go and see
if the regressions are still there?

Sure. I repeated the benchmark with v13, and it seems the behavior did
change. I no longer see the "big" regression when most of the pages get
updated (and need vacuuming).

I can't be 100% sure this is due to changes in the patch, because I did
some significant upgrades to the machine since that time - it has a Ryzen
9900x instead of the ancient i5-2500k, new mobo/RAM/... It's pretty
much a new machine; I only kept the "old" SATA SSD RAID storage so that
I can do some tests with non-NVMe.

So there's a (small) chance the previous runs were hitting a bottleneck
that does not exist on the new hardware.

Anyway, just to make this information more complete, the machine now has
this configuration:

* Ryzen 9 9900x (12/24C), 64GB RAM
* storage:
- data: Samsung SSD 990 PRO 4TB (NVMe)
- raid-nvme: RAID0 4x Samsung SSD 990 PRO 1TB (NVMe)
- raid-sata: RAID0 6x Intel DC3700 100GB (SATA)

Attached is the script, raw results (CSV) and two PDFs summarizing the
results as a pivot table for different test parameters. Compared to the
earlier run I tweaked the script to also vary io_combine_limit (ioc), as
I wanted to see how it interacts with effective_io_concurrency (eic).

Looking at the new results, I don't see any regressions, except for two
cases - data (single NVMe) and raid-nvme (4x NVMe). There's a small area
of regression for eic=32 and perc=0.0005, but only with WAL-logging.

I'm not sure this is worth worrying about too much. It's a heuristic,
and for every heuristic there's some combination of parameters where it
doesn't quite do the optimal thing. The areas where the patch brings
massive improvements (or does not regress) are much more significant.

I personally am happy with this behavior, seems to be performing fine.

There are also a couple of smaller regressions on the "xeon" machine with
an M.2 SSD, for lower update fractions - 0.05% of rows updated means ~5%
of pages need vacuum. But I think this is more a sign of us being too
aggressive in detecting (and forcing) sequential patterns - on master, we
end up scanning 50% of pages, thanks to SKIP_PAGES_THRESHOLD. I'll start
a new thread about that ...

Hmm. If only 5% of pages need vacuum, then you are right, readahead
seems like it would often be wasted (depending on _which_ 5% needs
vacuuming). But, the read stream API will only prefetch and build up
larger I/Os of blocks we actually need. So, it seems like this would
behave the same on master and with the patch. That is, both would do
extra unneeded I/O because of SKIP_PAGES_THRESHOLD. Is the 5% of the
table that needs vacuuming dispersed randomly throughout or
concentrated?
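The SKIP_PAGES_THRESHOLD behavior being discussed can be sketched roughly
as follows (a simplified model of my own, not the actual vacuumlazy.c
code): a run of skippable all-visible pages is only skipped when it is at
least SKIP_PAGES_THRESHOLD pages long, so evenly dispersed dirty pages can
force reading far more of the table than strictly needs vacuuming.

```python
SKIP_PAGES_THRESHOLD = 32  # same value as in vacuumlazy.c

def blocks_read(nblocks, needs_vacuum):
    """Return the blocks vacuum would read under a simplified
    SKIP_PAGES_THRESHOLD heuristic: short skippable runs are read
    anyway to keep the access pattern sequential."""
    read = []
    blk = 0
    while blk < nblocks:
        if blk in needs_vacuum:
            read.append(blk)
            blk += 1
            continue
        # measure the run of skippable pages starting here
        run_end = blk
        while run_end < nblocks and run_end not in needs_vacuum:
            run_end += 1
        if run_end - blk >= SKIP_PAGES_THRESHOLD:
            blk = run_end                         # long run: skip it
        else:
            read.extend(range(blk, run_end))      # short run: read anyway
            blk = run_end
    return read

# 5% of pages need vacuum, evenly spaced every 20 pages: every skippable
# run is only 19 pages long, so *all* 1000 pages end up being read.
need = set(range(0, 1000, 20))
print(len(blocks_read(1000, need)))  # 1000
```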

If it is concentrated and readahead would be useful, then maybe we
need to increase read_ahead_kb. You mentioned off-list that the
read_ahead_kb on this machine for this SSD was 128kB -- the same as
io_combine_limit. If we want to read ahead, issuing 128 kB I/Os might
be thwarting us and increasing latency.

I haven't run the new benchmark on the "xeon" machine, but I believe
this is the same regression that I mentioned above. I mean, it's the
same "area" where it happens, also for NVMe storage, so I guess it has
the same cause. But I don't have any particular insight into this.

FWIW there's one more interesting thing - entirely unrelated to this
patch, but visible in the "eic vs. ioc" results, which are meant to show
how these two GUCs interact. For (eic <= 1) the ioc doesn't seem to
matter very much, but for (eic > 1) it's clear higher ioc values are
better. Except that it "breaks" at 512kB, because I didn't realize we
allow ioc only up to 256kB. So 512kB does nothing, it just defaults to
the 128kB limit.

So this brings two questions:

* Does it still make sense to default to eic=1? For this particular test,
increasing eic to 4 often cuts the duration in half (especially on NVMe
storage).

* Why are we limiting ioc to <= 256kB? Per the benchmark it seems it
might be beneficial to set even higher values.

regards

--
Tomas Vondra

Attachments:

ioc-eic.pdfapplication/pdf; name=ioc-eic.pdfDownload
%PDF-1.7
%����
4 0 obj
<< /Length 5 0 R
   /Filter /FlateDecode
>>
stream
x��\M���q��_��{.��G�\:��H�8i ����v{�?z&��P�CIU�o�b�z:yX�:,�.-v������K1Z>����������W��������&���6P�W�l�#Niao
�\h�������������/��~��������o��o��~���/kl�6��<8����_�{���_�,o�5�7�Z�'���kJq����uak
���d����	>,���o_�?��-����'C�)�ZS\����>��@����L$���7YS8�?���+d�����\����Kr�����mJ+&�R�Wf&���3)��}ra}w��1�bmDV��:dh`�7b]q	�V��s�s�;e;�����1����y����}��,s��3������\,������^�{���df�~�^�[_��k�=��<�0Q��|{��>�\/r
�0R/�����T�����+�u�5|9�a.����/�.��N>�@y�^���0��E��G��
<s�_�s��K &�)��������&���{���i�gvL��������Q��7+��K��'��;)M�n0/�S)%���_�w����������������������"�7���1���S���������[��zoc�<�����N��w�f�gW��W�#@t���A��5LW�i�\�W�&k�?r��Cw{0��~��\+hE=5��.����	]�T�#��z��q1�g	�giLN�X�*b�
�_@��b�-�Iw������:t�e|��g���u�����q��;u�7�5|��\�������G��\��c��!�
���:����J,�W����|g���<����.Now>��1%c�G|r���2��������+ah�#��.!�������|���LS U{����z�Y�h������)�������)�]O^(�h~}q�'�5|u������J?����:uy���e�3v���lx.�}�������;�%�����bx�{���'�V��v���6��vet��U�v��6e�]et����wwrE�e
�q��71\�UB�S�g{���'������D����_t�������.�F
� Vd8��N�n��3�0�-(�"�p�_ �|G�f'[Ao���I�V��A?������,Z�����sG�r��ciny��2;�
|w�}�eM�Rk�r�}~�-I�}�Q�h��y�:=bx��\z5�<y�����^	��zo��^s��{��k�!^�=B��n�.��Z��D�9.�������D���n6�I��x��_��`W�;�L3>�������S�uY���.n�]]�B��b�z�����VO��:�.��W����2����d8�_�U�y��|���U��;�X^�y����u�E<-�g�A�S����T+r4������7�������V���5����,��<�tKTu���h\.�4m���9 ������-�9��c�N��Q��\�Tw�<�(�T��K
�����JE8������x��Z��#�_G����� m�j�[z�B�>Y*x>�u}�Vp���J�C���1'�@����2��]vu�����J�r0�u~��
q��O��"V���t���h��=�m�k����u��tM���gSW�kG�W]Hs@�#	�S.����d����r�iGC��+�^�u����Mg�V9%P�
x�����k����9�d�e[��.�v�ep,�������)W�}4������"��!�����PC��t�%�$q���f��=;��V+�<��������:�8�
^y}���q$��*�����N<��7�L���{���q��}���:�
^;����d��uK;���C�^�W�����s����r�^F_��.�Z����=���t8��}s��zS]@������8e�q��e@�*L������>6p:�9���l����VN�v������8��Vm��k��������`8���j�*���D���#Z>i�_�?!i��o���mG���}���������"�`B��7�3�5���!���%����>iZ�ly�7
\�V�'@��X��v}�nr�����R�lc\��Q#a'�E-�N�����ol�^���Y�o�*�g*v}"C��Ed�h�L:f���B*�B�N�\�G��{E����2�S^L��;��+�-���a�6>kx�S�x�S��w��ZF����K���������|�&	5�J�3���l�$��?����gG�6��z�L{c^��3*R��e����L
��*tMUN1l� �^��u�Le��eJe���W���7���k�s���4�'_���M_N];	���5������	/yK1i����^���2o��xI�4�)'T��.�������	x���\(��l��[��z&��Z!�c�t��+l����l
�ER�����Q��;�t,-R��Tj�+=W��:�\�>5��39Mxh`���5��4{m�cG|��^x6�5�|����v�	��4����@������6
��Zj�m�cG����iW�Wf�j~ ^�������b��Z�"��5k�����p��;��#\�]�������>��S�W��.����Z���k��)���J�D}1>\U�\��Wa���<���K�1��k��~6��>_z��wW��w�cyu�9b��s����/JU�<���V��wgW��R��'rr7���;��0�������jV4$F#V6����Gc;W�����P�	��W��������,���9�l�:�����"������F6���e'Vz����X}|}BN_��/���;��f=r��
�����?q�h^2b	<�n���I���}�����J�w���&��j�v�\�+&�(����*X�N��s!�4��!�?p8��H�8��z�����O��	x��^S���������ws�$^������\���2>[Ka���]~�B��_���b�]>��]�c��B�Gal�-���_���e�aaK&D���e���w/@6��G����a�����w�J��i��E�{S�7�n�6�����s	��N�EDAF6������TCM��H�3u�f��f^���+{OJq��=I��wn(|����N��D��[��Ho'[U��X}��������)����rY�B��� �(K2)�����
	jM.�s��7�;w}�������A|���v�Z?OcGmw����/������������=���Ncmwn�5AY�U�����o��C�~��lu@�����������*���r�����5��dJ,����r���/���m=�/n}��o��o]s�������{�]�?|���Oz{P�����^�����i{���?�����������?���O_>�?O�f	�|�\'�q�[g��S���{����/?�����G�_�����*��s��_�����D��W�}�n{�/V��?����]������%�^]��*��Ju�8����>�R[����qK���R��U�n[����zbRAk����X�O5\���o_�~���������?}|{��k���ZK�D��G*��S��=�X�_��p)����'7=v}M�����J���+�+K�2�:��Rym��8�)�f��<���|�#��~�$b�Fc��2�SI:<6���~m��Ad��H�^<\�T�7$�$�I���dI�(�LR��&)�f3�~?� 5����xc���{	���~K��%���t&�[I�c�9z�$B����2����-7~{$�8�g���
�q��6���fJ,���4�������D��X:���lI���C�\����.��^����9Z:��-�Q�������:Q�qF�p=�`c��,2��� \34!���!�����������������# ����j��0J~f�3����������(np_SF�����
v<�(E�p���I�`"^��O�1�8x������(c��e�cx�X�d�i�Y&)S.�4�R&	� n�w���DK��}�C~G��C���q�����^{XJ��ES��K�Gc�6�x>FQ�G�W�1B������nr%
Kt���mIaibC�^m	��Atv\w��G�^��`�Y����o����A��
>,�����g</r3���NpP�z���M����.��A��a�%�'��:�5�0��D�K�]���������<q-qf.�5������V= q�`����e�t�����S$��r2< ��=Iv��.�n����!|.>�e��D�kp�����n�(	f��U�'�ZIJ�)�`�-��g �!��Q^�!	����X��v8�z-���6��d��$�,�q�e�� { �`��M��`������hG>=���zk\|��A�H�l2��~��DCF�i������D�]��w{;��4�����v���������������q�;�}�aCb����A�a��N�����W����\�I�{;�����M���u�d\W�u?y%)��.�.�l�wZsD;����fb�D�F�X|R��dIM�� �.��[���fH�0,DW�H ������K�M����1��d`���CI�=�	�`��Ch$�'���
p����qv���i,!073� .q���,	�m��xl�"X[�@���
����.%��y�������l	\24E�<1[7�1�C;*��2l�9�������-��S�.�n�B@��
���N#$1 Ll��O�j�V{�n�������,����:~&�X'a����%���E�
�B�]�<S��p���QB�r�1l-�H��;E>�]@U��c���` �����{���-��c$��C�.��I��Kp����J�T��|&3$q��L�����i%	� ���Ab�����F~,�za9�99��������$O�G���U	����@B{�~_V���'W0��h���D�����xmU�� �5����^�d!$5C�����z/�78���c����2N`�I�%p"	���%�����x�c�,@.�FV{9�I>!b�-\5�]��G:L�{��������	���'bC���x�<�y"T2�2@2d��8��d���%*������G������|�:FQ�����V@��$�9utr��u`TVp�d�wt#e��9�Dx
v�wu#�1D.�Ok$#�{8���p�01��y��|����Wb jp���C�c8�KV���������)�����\��YY�v�fSYJP]� G�q3���I"�%\p�W!]��#�3f���a#��&
�M7���N�����1�"��D����L����'��:���jU3f��R��=)(1E�CS4�}���	y4�����?f�.����VD�98(�k'-���@�RJ'�"�z 0Z���}H7���3�(c�"�V1��{���2OQ�{�<�;��,?[�I& �	K��	-H���`^=���>
.=�����aW�������� MrFA�l9f�_j�TP�A����o
#�
��87���h���=������B��O���R�(�?�#�Q/�Zo��|rJ}�C���j(��>��q���
1��E���������v-����T&���,Y���\.�`?#'CP�U����^D�T7d��)�_�Y(��7vQv�~������^����E���Q��g)*������2�s��z�l{���Y�(�$&8Hk�%�����:���U����xV�$/�=�or��B�7��A���7Y:5����&�}������?,�V����M����D�m�����t�Y�4��i��������a�^�}A�/�����&�q8�!�-��J@�3������4N���I$�jV!XL��$� /JS������h���oY���@�%C0)%0>��V��d��\#AT��$�ArJ�7"�^�BF������ay[��@v����GB2}cR���|���-&A\�5{<,�^)
[��^X���A������t�tH��~�����r�n�7^~���
endstream
endobj
5 0 obj
   6087
endobj
3 0 obj
<<
   /ExtGState <<
      /a0 << /CA 1 /ca 1 >>
   >>
   /Font <<
      /f-0-0 7 0 R
   >>
>>
endobj
8 0 obj
<< /Type /ObjStm
   /Length 9 0 R
   /N 1
   /First 4
   /Filter /FlateDecode
>>
stream
x�3S0�����8]
endstream
endobj
9 0 obj
   16
endobj
12 0 obj
<< /Length 13 0 R
   /Filter /FlateDecode
>>
stream
x��]M���q��_�ew�G�E�Hn���^��p�I`��~@]�u(�J���,����X��M�N�������-���������>����������o����������%�@�o��+����k�k�?}������oa����������/���o�����������}��}ru��~���o���?��}��R���������]K���y������R��{��n%8�[�.��UW�������G����\����]+�c��R����S�����^_�S{
:����{��"�5�K�1\9�>�[���\�:�5�D����������'�P��k�=��<
�]�_��R�5�@�Ua���s��]���������qu���
�(����d�����J��l���G��<���a����5���il�q�����1���M_��%��/���+�7&�"����`���r*��7}�k�u�3�
�w1��3�&~��kU�z����E����A�>� ������k��l�!�5��u�4�E��1��l��j�r��&���d��R�����`J.dM��e��������&$�%�
/�� W�Y��s�T����%�I��Z)�$Zq}���w:�{
�F~�}ljh%�
"��^����
4L?W�
�Q�A�P�����J0F5���/'����n��J�{�|+�c5����/�$g�8(��eK.=UqPm�kK��:������#���v0a�B�U@�:[E�&�pR���R�
#G�F����+9@�	>�:��=R �
xG�\�0v�Wc�oT&�������|�������G ���,����W����7����ZM�Kq�,���Qd[G�W/����������q[��W*��������6�f�<���Z��U�C4#���O���5#�:,������-�5Lzm��Y�����	��_h���^�_^;�i���W�����$�_�����e�"��$]No��[;��q�p���v4;�v�
V��d
����l��\dE�FT�)����x��6HF�5���~c�6�&�:���y�0�?/��I�'�\���-({���T�]���R�����<ppk�5S�"���R]��yw�C<|������E�;�}0j����,R��4o��������K��J
|��M4��jx-<Nz�~Qf��Y�����
����9�M�q�w��u���o�A�~5�ZnY@�5gC�&�*^��Avf�x~��h���d������m�M�� Z����&���V�4�����/W��k����1/:9�2���Lo:�bN��y��,���.��$����E�@&2�s6��7�� ��Uf�Q@�-�uM'�����Lz+�i���4��p�u�����Z�3[�s7`Ri�ti�]k�\����|����n����O�!A<h���muk�zKqrn���~��wY��W��-��pl�s}s�����:���Vc�������&k��n�69o.�����Y�����_KMz3K�6i�������������A�1���g�sl�,��G�&��:

�j�
|-)5��'�7���hckpo�]�����lfD*�G
��j,�������@$�_hT�hKk����S��x+W���j^������V� �~�st]p�1��{=xWq�U*}+&0����i���������)9.�xm�#�De������3q�h.;x~�S����
�I�%�.&G�����x��������[��W��(���gx����������������*�������sk��
(*�s��&Z
![z��p������3�x{rn�����)N����5/M���)/QI���}g]�I���3���qq7��!�kYz���}N�>�=����+����������^
~���?�����������/���N�lw��:w�_��5��2.����Gn)�P�\S��y"�?����K�3aX:&N+��?����Jh�=[��Y����r���I�!�k���]y�J�7��A|���
�Y�`\��~C�`\[���������������}�{;=����~c#a��&?|�p�
�@��.���Z�������$�����7�M-���V]�[��,1��R,�������}A/���������l1�xx%BB�-�9i����&��:J����Bv����RmM�f�6x5��/b\��+�Cf�s=��`�5!�h�z��y��C�f,�����Ch��r�0oqg��������sn�s�{O-��4��W��#����uh�_)P��{��'������)���&������\-��Rcpo>@fp�`��8WW�9'�<��w�����]�+�����c�K�D%.)���>M����ix#������X�W�1����
�����Y����p�f�}�kl�
�-�����B�u��[��_�D-���4�/�:Lz
��i�8�8�o.���.�����S�{��ts]���7��'����-�]GKe��S���Q�@#��������=��q����y�.����X6��(=��~���ufgP��/T��::�����B���(	�S�5%��f0_�W�F�J������g���:8��{�o��=�{���Yk��6.�mz?�5�o1[���~��H����=�����,�%������DG)� �����M�U�����cR���:����������l]}l��F/�m�����l[������K���7>�5����W.G�
t�yop4A>0>�7�n���[xk{uR�����^���(�z�r(K���_*�f/qKK��d<<d�s��P�/��>����qL�u��[A��o[p�z������#l����_o���w�o���i�cD��Qf�u�,�E�������������_�/_{r���o�'�{��qa}�
��!��Rc�y\�����Y����i�W�Y�~�����bt�/��������ye���]��M�0�S���	tZ)!t^��Am+��b��W����������J�WT��.q�>�F�}@�+�o�'Z�q���'gDW��X/5t\y����<�w\9�w�P��e�G����o!e����[)��F���P�?_	�?cmt��XH������~�,�����|��.-�qE���c��z���SN�c�]��<�3�c�xE���\�<[c�{���=����m�1�q4�u��c��*�~�Lc�F�gpT�����?K��Ds6%�G���2��K
S���c����@��*wI(�)_����y�<��c{vU����;�1�G���(�pp��_�I�����N,M%�&�u�U��	��IyR��H�Mt��(c���
�BJ"}����9�m�m��g_���:I*A�I�5�(�2b��0��,��%�*��4��
�
�f�\�V�U$�*P�4����*�H�Z[b�]L1�xX������81
c1����[�ZV�{���ZUa���8���[6
�����J]�"�f�\g@��#�ZE�R�,u/."W����v)�9�M��0���\w����]F]J^�q9Xb�#ga�����1F�c]�eN�g���Ny2���VR]Q.��d�����{!����Nj6U�3VD���������c�X���z���}����d5����p1�����&
��m�.,�����R*��
�wa����.���}cC#)7�!y�tv!�IR���X�0����O�m���JM����BW��w
������������5�X����#���6�V���(p�ERQ9�*~H|����i����A/����(�D�S�pt����](��u2a��>)��Rb�����*c����{T%8t#!��s�$?��3���D�jo���K$�^���(,M�dcT�x���qS� p��$����
�]<}�7b]����	��s�"t�r��&&tV�5���Q7�u{�������S���S����C��r?ie�%��yG~{c{(Y���] ������Bz��&��+hf�>��'H�<!I�5yj0~�P�*�"UK������B�q�$yH�����my,���0K�.H�j��f KA��O%K��D����L�%zh�8�_��Df������-�#��������!#@R@MR*��E���)�)��!H@L�"Q� �UIbV5���&�UHSf��C��5z��B� @B�X�B�KY�{��z���#"P�Q���$���8%v�D}����X�,�*�.x%�v6�����V���.|�L)U��K��pD��[���y�t������Nb��(���
�H3+D���^};���z,�RX�&: �*H	�!M�2��pEj^^G:*�w�{�R�\��kM��z�-�ZIi�h��c��g]$��4�(�9v�P�Q�8J�8q��n;�5�M+�v���s�c�n�lZ�hR*�/��X�������Q1������,�Q&$���#��q'H�1�����N�_�������`o\EL��M�
��Vu�E}��fIX�K��Z�������}�������XU�R	�OC.N,n�g��F�A)����c�4��q? ,���}���c�������|�z��o�������O?vD�v=1�ko"JG���(����$<���������H+�~���FGR$�P����8�n����w����Y�]%~=��u%��o��<~��`M^1����M�p*���~W��"�}�$U���'�BR�i���"�[�j�(����?���u���"��c+��I%�B��bV���O������b���7�����8���U����E^��Y���w_`I3�����%m�u^1u]��n���H����+����Ro���\�(�O@M���d�C�c}P�e�@�r~O���M`��~*I'�QG'����������N���x9!���;�@0��G���1��Z����	���<x*b �-�(����6g�(
y=,�Z��z�h����3�^{,����X�$*��|�h�
v'�U��ei/����E�������{����X��TM�E�R�)T�����)�
W��_t�`	b��R�g)&�5�1���7��~��y���ti��|�KKz�����,�fkmw���,UyT�� ��V��E>E=�H����
��A!gA;��oc�-\�C����OC����,��Z��W��.2~�r�[�X��gQI�������X!�d��2AG$��L"�vT�dD)@R{�h�k�Eb���I���o�������� ��s8x���*0���!gk��m�1����'?�V9�KP�����JrV��#�7]H�#���������Z�����,,Uy]z��C�P8Y��
�h\�
��G;�%�.3d8`����f8��!�����)'�h�p
z�`u�t�X��dN�F�������H,3dp\���� �ea,E����h��J;*�"p�Td�	���-(23�&4�+���(��n8�/x�K>��e���������Ln����R��^R��xbU��0�
(@F@��60���8�Tx��Q�n��]�.��j$%�V1^��
 }��)L��g� Tl]=S�a)6
m�����POMBU�IB�@����8FD�4�w�5�Ih��
�����Et�a&=WU���������d��A��%�n��:��O
.Q#��1v	�����CE��'� �
��c����vu8wN	����P�*�4�BA����X/��[��{}��X�%�����%h���&)���Q�M���R�eIQ?&�G-<8i,�*42==����+)���YZbZ�9���B�'e�NR��)�Z��R��i����p����	H� �j}	>�K��O��h"���TG!s�	��C�
+z�����
���	���������W^
endstream
endobj
13 0 obj
   5997
endobj
11 0 obj
<<
   /ExtGState <<
      /a0 << /CA 1 /ca 1 >>
   >>
   /Font <<
      /f-0-0 7 0 R
   >>
>>
endobj
15 0 obj
<< /Type /ObjStm
   /Length 16 0 R
   /N 1
   /First 5
   /Filter /FlateDecode
>>
stream
x�34Q0�������
endstream
endobj
16 0 obj
   17
endobj
19 0 obj
<< /Length 20 0 R
   /Filter /FlateDecode
>>
stream
x���M���q���W�ew�K�������@�8�����|a���(�����J���7j]���XU��	����������%������7�������?����/o���r	�h��}v%�\�#�k��_~}��O���?���Oo�}�����������������/~{����KL��\��������_����>q�L�������yWZk�>�k�Q��Z�����w��G	.���c�Gu������_��dNmS��RW��Z��\��P������S�����mS��R��~�j�M\9�RJ��W����Tb�.M\k�~N����%FGlU:�B[Kk�9����>l�]I��p��t`����[�9#���O���������E��*���V�q�?�C({/��m�C���Ps�9���n�f�)P����[/��TKD��
np��M~��M=������U��8-�A�R��R]*�������wM������b]�5����b2DW���?��(4�����tag�X_;�,@�O-m�����z*� M�r���v����g$�����o~\_]�@��I��|������)��0�B	��D�����I>`�5Z�c�m�K�u<�\����O[n��=�
K@n�<�L�T} �)o��i��+��Tb���M�������Q��zJ�`����.6����z�5FZ��@y!o��)D^[�U�Gv!)�7��l^m���
}.��F���-|������{���-4.���
�������X���7��i��������)�V�y���s1:��J��Ahe�q����w�)f�h�S�����ca^�U��jx�5m�jx������pp�)�k$j���r�n�G���%���������uq��B\���1�ZU7R�	��K1#�� ����3}x��x�g��"H�wb�������_P�����(UaW�U����A��|�����KFv\,pA�����u�Ud�
�Z�t"6+;���bAl0�=:_����bR���K���![�=S����vw�=gE��i�|���`��`�`�����U�Wf��rS���D^��e�C��W���6��vGl1�c�������-�������`O
��]���e��������-�a}?�?���uu���z��4�[u7���������`�D��e�����> t���(��&��x��,��g{���F�	�T�*������%V�3�����}�rm-���_ts-E���-{���U5�E��\/�����[��r��/
_�F5���k�������/�hU����o��Y�	O�_�&
�@?��2�6�j����n�kM����
��--(U����5k=�s$���lxR�6J���%~�uku5�~��4e�Z�\]� �1�,��f�fd9�5	4Mvk����Iy�����xx�p���~����g��%u�7e����e;��jV3'�-�r5�g^kM�q^������9��,Q��)��Ue�v���s;����#�@����}x=h[���H�eH�Z2�?gL�X�@�V���g"W�b��na��G�`]S�.
���@�7ez���z���A���B�����K����P[����qL�I�~`���l{��W�M�-���f5�k*��8P�Z��}�/���k(��3�g8X]�4���#w�JU~�d��� ~,��T�fD������J�'��h�����4�3#�.�%��/n��X�:�QKy��8�9�dl�oL�)T��~rT�����|�z��R�kF[��� �lU��p��y���2��������z���V�Z���+�K��P)���5Vh��m�!<c��� =8�����<�����b�5[�b?/Y���F�@��r�]�W���)�������#��|f��@����~QkmM�Jwr�npN����M��o�oKn���f���O�z�����#���5@xKo����p6a��������%�^�\�o8E2�[�;���1���?h����
b�o�<<�4�]�ksM���7u��'P�^��u�5�j��7�jt��Z�+�Y3k����p��,����!�����_l�M���A���s��<X�B5�@����&����b�"�gf)��ES��&�Ot~����p���������v�v��Q4X�rc�!���B��f����?�ZS+�M���Zc�3X���cGv>�/����xU���E�s���V��Wyb�u�d)�Z������+��5h��s����p��W�����I�������[���^Y�8h�O��b05�y��^2B\�.���N����P?���F��.�*^>b�����Z�{+�Vc�V295G��w����j��(���0�*�[�o�������Q��:��A������O�,����o���Ork��@o�-�<�[u��"d�o�p��RJ�_�K��������U�q��Q��g��������!>��z����|����l|�5���{��kr����'{���*�w�|�?�P�V��Y����>>�[����b�F��;�yZ}��	Dn���������.�s��Mp���1�'�m��#y�?�~�H������@^W������ o����[_ y��>�$��[�KY6��'�����owSm�������������������C���O���{�Hi����Y�\����1�&yCo~0;���I�|+����t�6�����z���r��/��=����Wx��_�Q�>��o�������������?~x��o��?������������^�#��>���~�;�������> �/����|w�P����q"dv-'T�;]����m;��q�B�\�W�:�G*�q�Y�����1Pr�B�����J.2*���'���}_5������^H�Vo��C��@9�\J�'09��;�����g@z�����;�H��ki��[�:�\J��[�����J��j�����������*�;C��t�t���a��f���>tk���G)��cz�������?y��?���i�������b�[~��r�w\XK I�m���R*�>1o����B�Plli>�ujp��b����R�bs��2�y��9��z/��T��=�XJ
{1!~/����4���TH�
���'/��$�Zk��
�@�R�]ZbS!e)6:�m&R�i/��P������)!�d���F)�)��RJSD���S*��)��
�O�J\���A)�D9�J���n�D�?����C���J��&�=\���������XT�W����5
 +d��y��U����%�m%x���X�(������2����U.Xk���*%�r%�����\'����n��uBj�4�
X�tu��b��@��SR!1X��\���������c�ww�����u��Vl�.�k.���H�2���UH��0pM2^2���v9'�{�j2$�!��XX�*�\!$4V�������sNM	�t����F)�)5#��8�w&p�LP�J/��jB�����V��(A~��2�0�-n6��)A"49����\��S���+���~�cFQ������X�X���D�(��4��|�����M��1��t�1%)v�[��i��`���������0^����+�x�S��B0ZO�J�l�
0�e�H�X'����(@�<��V�O����c]���J���b1������b
��
��R`�:/.�a��l�ey�'�i���o���1�x������0/�J���8��BPk�R�B[.�b=!�H[!{�$���X��	�[�
)E��RT�Zn��28��*m������8d����Ni<�������	1)*4�Iu�ML�<�:;p�!I�D=��1Z!Sh����ub��`�O�I�r�\�, �~�4!��.������X&y�{>�UMI,���c�����&vsU��P���F�P�O�����wF��3Q��.:���VYr����Tc��/�oD��{���>�t�$k����������uH9$yg`� ���i��� �y�iJ�/�EG��
K��������
�T�����Bnp����;e
sI�5��e�Jf��~�I����2Bv�d>��"d���V(%��\{1�$�.�����y\n�`V��y��-g����e^�[J�m�(����3�*����Ld����K�O'H.�J��G�&����4�<�����0k�Z��{a��8�A�D������fy����\,��X����
�Yl�j��x���0a6`� ��������^�S��^�.�3��#f+#<���.�>aIE�$���5X�oz6�*x�q�pZxw^G�e�V*��`(�=���f�z.��+)�(`	���~�X�a^W�d
- ����4UaT���5����C��6C�b���WC�z�K���>������W(wC������LXV��	�<�%�g��-���u�k��`��k�1-�K*�@�W���hN6JMI��dkz���?�TbR'3L��)1�H�����gm��.6I��!��u���������YU�v��~�����oO�4p�
oMp?A����(k<<���������:���SQb�l�]`1�iO�sKzP�r���Z�
k�$����mY��(p��i�-��i�``�Z3AT!�@��44�B��8�Q`���~\�P�O���a��^y(	vK��`�fr�3Y[����6HKkRTK`p7��Y XPb%:<��xl;�~�lJ;Q�?(&X�`�5�"�>(��d����*�QBj����^�o1L����u��P��Q��2D����`+��%��[�����iA+ML	Oy��/���Gk�F�K����||�@�D��M���	&y�}S[1���1�4�c�&v\$h�����]�lZ������P'��X?<�.�� :bkR����N��jy�~9G��q��uI���	�����P'��<�G�@�
�%��`�*Y����[��l��0��s�<�
��{p�A������uB��RV9u~X�-f���}y^[E[��g�>��4�u������.
�U����f����[F��;+�� {o���RJ4�aN��4�++����0k�]��W�m�:�gu#4��+U��	f&h3���'ZlL���D!Rxx��g�zf�z%L�a�`��<�����������3�$X�`�H|}��d6��O��N�)08��|��3�H.F��S����`�	����t��^mE��%��tr�G����i&��X��L�z�bSrB�!O.�e�
����P�O��*���I�=����$rM0��y��[BR�C���%9�h���z�~�����Yv�H�f��.?���k+	vB�)$��x�>a�!���r���k�:�0/�3x�Vz��Y�3S�M���fw>�-!���������m�S3%H�Z����[V0���������(�H����iUp���F�0�E��]�B�:�P
��j�Y�nY�V��\�=��O���������������c�$��� ���~1�c#�����q�E�iG��UO�Y��#�Nx{H�<(7����H����*�Z��N�x8����������
~e���Z��x�sr^�:Gp����?
w�n�Tyn�zl\���Z�<������rM/��Cbx6G?p�+�I��%�
~��TV4�G�iI�f�[��h=P+���"zUL�l���E�����1HTJH���x��C	����JL�-�O��oa�?`����%M?�^�(Q)�b?���VT�Qm�Ji��QJ�$��28��{	�h����?}y�����@?�
endstream
endobj
20 0 obj
   5879
endobj
18 0 obj
<<
   /ExtGState <<
      /a0 << /CA 1 /ca 1 >>
   >>
   /Font <<
      /f-0-0 7 0 R
   >>
>>
endobj
22 0 obj
<< /Type /ObjStm
   /Length 23 0 R
   /N 1
   /First 5
   /Filter /FlateDecode
>>
stream
x�32T0�������
endstream
endobj
23 0 obj
   17
endobj
26 0 obj
<< /Length 27 0 R
   /Filter /FlateDecode
>>
stream
x���A�l�q����8�{W�T���
��!'w7xa�3�C�`���s�z�}J��1^���~u�G��RI_������]��k8~���/���_9~�G������Q��Z8���_��1�����K
��_?~��7�������_~���������W���?~����}|��K�����K�����������J_R�J_b��o�����������Q�wE���"�]�t�k�������2���~h2
.�h���5���_��O���CA��XR���VD����U�����_+U9�
T��ct�v}\��Q�]���l��:����w>���K�E?������p�54!����c
�w���>|�pJ���y�A���;}I"��J
�!�!/��	[��)��w�C������{-Rbi�0���Q� ��r�i���r����-NU<E�Zk
~����R_%ro�p���:�}�����/�*b��J��%�����@�[y�#�����Gp����7�m�r�>�����b��=U_�:�"�}�������$�������=�$�����j����/���WKm�m�\k�fW��~�O:�8�~�+�_����?���1���Z�~�N�eUg���]o]�u�5���l�7S����>dGd�%Rw�)�6��X�V��7�vX�d��zm��E5W�&!q����F�Pu�iQ�}H�q��������~������_����>&���KT�w����z,�y�G���u��������!k�:�H~��7���~x�����N�U��'~��No�}`G�*�����}_��*b���&��&>K��p������r��n�I"+����_=���x��Sp\
y��]V�����8���;��������U~E�K�W����)����LO8�����`}�[v�L�J��n��3���qL�)�y
}��y��K�o�WXh#��!D����
�����usO���������b���<��P���{_]"���y���M��>��b#BO��`��|X�\[�����m��.�9����;)�U^��mRXm~7t���
���/�b6�o��D���������O���Y��������7�1\�
����ZX���f��M}����Y? rf�y.��}�lh�~�lc��1��q�?�c���
�7������6=�X�n����m|��������a���dW}�|���ptR���O]l�.x�f��1��e�4��y��c�s�����Qx��+��N�o��MA��-{\�8���5�V)�W�g-K����;1����V����9D�{sFg�x���f�GO.'��m���Z��Wd������T������n����U�R��]}`n2O���o�J�N~���M��g$�k�{K�S=�|5�Hi����Z�3��E�X����9M��q������[���}Es�����7�����l��!�\�t�oC����7����Y�F���DE�Bo�c#-���
�����$�;�F��QJa"y�����#[��j��������!����r�)xt��n{8�o�|�
s�3N!���0<;kSrqMf7�����"�����'Jw��.���}@���N����q�m�g�G�g-i'v%�OW�&��0G.50'z�?����
#��[@o�7��A~�s}�s�l����p��3��sq�,���� ���f:�T����������S�ZD�����z/����	�sP������C�,z�/�E��/�
�;��dw�G�CPn���7�=:�<����{��g5zW��7}����\�'w�~7�����YF��\��M�v���[AD�[����2���1�7��w[[��������8����a��&wS��<%��[!���� ���\��A����f�H���)��h[���a���{TB��������xkp|	���R���w��)�
��{���^T
����ju��K0��O79��K��l���2���.8o��j��������m������]��}�����}7�k��;�s����z�N����]�fB���-��t�~� �2�Ld�2U��P7��"��nw}W�P����Lf��b0�n�������;x�=���FZ8�I�����������t���w�|d����{C?W���o�<<�?q8x���������t]O�����d!��|wv���$/�x,��W��~�ah���Hx�?���K ��z��g�6���}s�g��xq��Rr��W�D�m�=������������7�|�y����|��w��
L��\��;v5�Iv��^[�k�f	�b����z�*���+�����2T�)�D�m��v�������
���N�;!�Q���� u�����~�Xu����n�����~��<��;G9h6t%H~����K���=�_���V�E�V�;��@�M)@����?L�;t��;T���[�WW����Do��9�d�O����:����M�����1�j��qao���p�ql���Xg*3o�N����}���Y�x8���7c�wC��<�����h�C��]�{��Wm�)�;���<�&e�?��a�m���Ey#�#�����&���w�M�v�o��,no������~��o�e]������.�j#?��o�����k��Qt�m���q3�Bv������r�D�����&k���M�Q��w}�x����?������?���w�>�n���0����4~yw(�w+��:,��y�;m�pO�;���D�	/������q<|w�=�c���R�M����=��_	��<SM��jR}y7e��o��^�_'�bj�_��������un5��v������NM��p��(Q�����Er#�kg���7��)�����[����d�_��~jb��7Y��z�O���Oz��oN:��F��".�����
|;�s4��~�C�K��/��~J�h����V3���W��K3�����v�y�m6�t+p��?����.�����'rS��h=Iv��Sr�bm���M�����?��(oesM���������|t���MFM�D��&v������)���&���n���d���&��_������%�~�gZ[��z}m�����Dm��;f��s6��_��Bt�X��0�z+��P����du1���=�`����R��l:�7������H�����?~�������w���G���9����o����+r|����������.�9�������?�����s���41�m���Z P�P�:������^����O^�Sj�O����4��/^	S�� ���d���=��	���'�:��������&�������
�L�x��H�_J�����Ka����O^K�\�j���Ki;�����ka�!��Qy=y)m�`������������'������W=�/�]�Rv����`v5���/��?�>|�����-$'%�v���8�B_����&���9��9>>�������f���X����z�F0�
��|}-8�����ho�F}���x%8�Pj<��M����o\�9�=�';v��gos��d�W`������}S���_��}(�f����I���d���I����|��Rx:T���iB���z�$��k_�xp���l�$m����-��s�f�WC����$`I`�1���{`s���=@
&�_�,#�<����\L�^�D"����6��Rs*L�<9�
��D���z[�����,�2��B3���MJ))%���m
��E�3P��l)�gp%
��}�$�"P�����;I����@�D�
[*.S������#���`�D:���Q�^���"6���j���aFpR��lj����Y��[&���>����N���)�eX���3U����"��6��t)F���y���2���l��N�\S�N�#���A"0X}�|S��G��".j7F��S��9{��J�fs��=�Cy�F;�m�-F0��$>�`��2
f��X���i$].��c)��vRo�PF?��v��(��`�,+�'�p��s]��I�����5C���	�F������W�
�%�f��S� ���N��(�`�\�0U�����y�v�f���V>����v���N ]b��\ ���0�fI;�b�5�W�����2d�%���\�����=%Q������`��:s!��f��U%�<xf�HU�b��G��'����IW;��5~�&�	�l���� ����Dt���%��]N������=Hg/0�`.CzY`oG���X������ AF�KL<<��^�g{{�;R��8DH����24�-���������g%E��N�x�A]`����!v��c����D����,�l6WL��'
�^U	 �5�����f5��@7��z�D���]Z�����4%(�a�b]�����"M����_l����������yVM�BO�
�+|����s�:�L_\�+3��
=�]�������?��vR����������hN[���9l3�'���k��#Y��Z �g;���K@�Z��vm������������@���3	�C����(CX�Y�h��\���/��R�_m[D!��51!�z��(����� m�����W_��7g�I0z2�����"������z�s��.���I>�z����|C��_��5������7yI����D�;���o�,JY����,8�Q]��2�S�N��X�+��%�IA�R?&g�`�bT2G���]������$��pUR�|t��>��+��/����+D����T����0�����3ph�`g[�z���:	T��'`=,0�� �����b&
���O�M��������(<l�=(,�J��1D��~t����
b���q��+s4Z&	�`YU`O����'[��]��_��
_~��X'�.#� A� ��3�	I�
�o:��\]�jE��p���b�e�s��6���X7&�\$�)A>�P�$] 1O�f�iJ�{�=P��F`kvQ��d2
�=O���u��J���x�'�2�d�y���q[c��F��h�!@P�����&��
��3�����\p��]�;ryv����Y�TF���?��hd�M"�kq����%������d�fU(T��U��Z�,o��=��*�~�W�8�j��iUF2��6k�=��������>B�A[�j5��H�M��(<3z��%��7�,
���WH�.5�<}65Ww�-��������}(,a`��_mYD!��E��^P?��e�L���	)�:G,�5��~�s~��JJ�*����A�3_+�P����W	��u:���x����`��<��g�������u�)JL3��#��.oQ�g�{x�A�J3��~qR�FF��	��]������Y������F�'�����+a���,�`fF�����l.�"��W�������:�B
�x���K7�u)B�.�@�������#�iw�UV!_\�`�����)�J�Q���@T*A?s@����Y���8�yB��V������j��*fUv�]�o�q�p���g!'^G��D������iO�,������)�+�v�K��z���'p2Q	{8;�V}��~
A��[,CJ��t0����5`d�[
�'4i��D���v���Z��R�����k��,phfFP�s3 ����6�!���pN#���6�.78W��V(������^x��IN���U�J��j�(�'I���H2���3������f���U]a�V3��
��X�j�R�&�U]�:�I�8�x\dl����3FP�P�Y0+�"h�a�Zd��x�Zq�P?������Pn���Pz�� '!'��F	�5+Y�E���DD���|�������4�;�����p��_�=
uY���;H�,����&L�:n��iF���n V���
���J�`>��_����`�h1I��3�#�I���T �\�'���{H�����e�zoIbT��n��<�##��!�<lBB�E�����[���zT�Px`�,9C��6�u5��������%L������(E�����x����S��i�0�����}��<AB�=�����/�ui&�����$���p	\zj?�~?��B���E�&������l����$�W�j�\>c���m]r�h��u������,
���.�N�����$�����ljn��LY�����t����<w/�j*I!���x���dI�4���WS�f�u�� !H�D2{B���M��A�mE�f%���{�x��������a��
endstream
endobj
27 0 obj
   6325
endobj
25 0 obj
<<
   /ExtGState <<
      /a0 << /CA 1 /ca 1 >>
   >>
   /Font <<
      /f-0-0 7 0 R
   >>
>>
endobj
29 0 obj
<< /Type /ObjStm
   /Length 30 0 R
   /N 1
   /First 5
   /Filter /FlateDecode
>>
stream
x�3�P0������
endstream
endobj
30 0 obj
   17
endobj
33 0 obj
<< /Length 34 0 R
   /Filter /FlateDecode
>>
stream
x���K���q���Wh���,�x�0'�x1�7lxbx��? ?�|)�J:��7�^�|X�*^�C����N��B	��
m?����7������o����������hr)1�K4��m4�QLi���hs����o����}�m_��������������������wo���lra{��������������O�g���g���7���6kJ�>�Y����51�-�q~����-���/?m��7G���])M���u�z�J���j�:�	Q�bZ�9U�L��(��y~��%��g��������3)h/�Z#�8@1[c�H��8Fa�(�%;B�\qi�����R�Q[����S���G��MN���u���#�����b3�-�����pu����V�x������E9-�0���K2�|����e�����C�*L���S��N�[{�1���l;f�����o�9�L�����,o�e��f(/���?�����bahw�j_5'�CV�!f�+,�C��)�EXs�S�7.��F�"�N[R����}�8n`)��P�FY�����q��8H=~���$z�!��5��w��������n��	Y>���_����Jj=��w�f
�1��L&�|v���y��w�(?��>���;ZI�(7�������uaN#��h�5����^VM_�o��Q��c[�0x�`9��Y��|�&MUxa��-�n�n!�/;9�JM���V��E����l\�l�Y@���aB���n@y����������"w�)��yBw�����BG����x:MQ���4�lV���|�r0���!f������c
�
H�'�+��Y�����x�f��
HW�3T�$yGG����&��`Q�g����B�;_��J���;�z��w��^^�t���y|�)kje��b-��f��9sg�u9�l]\���_�#g|��;ll�t��~L�a+'�����������w�ROG�C��N>�[��p[����T8-����e���\Z���iMd��C����9���������a���1��X�"~���.L]����������+�bC�e�f	�[c�����wk�-���5��mh�-iq���8j�sk<kC6�=���j[b�Vw����A^�s������W��d'��
)\��$�����A/~�<�:{Y������7R�O�r {E�L�����O�|����=��M������p������:z��/Rv���m��<D�q~�$O��AF���Y��5�}J����~�M��#�N?�r��~��������������2�����W��>/\G���b��T*�P97��=^L�Z}bCVP{�]���8[]���������#R����F�����m����./6��k�z^Eu�G����6��%Z?^�]��U�����DG%�5�fW;q-���+y'/��bnG!�j����������1�&.���vY��o�������:�s��mKkm&J��f��V��<��F9�n��I��E���+����t���z��5�o����N���+��rZ��z��n��������}���'���{[G�Z^����_�k�sc����9�������+�M�����Ksv�����.�o�a�-��w���
��������
��a���������\��������q�O���6g��SM=6g'��<w�Z��Q����W��C���X�~K$�.��9��.,�?���@����Ri���KJ)�I�"�"Z'�5�N�(�@�����������=���"���(��]��J�
���%���j�X�Ib�TJ�-��RZ���}*n�h[�u"-N^k��+ujzG���eL�_���:f����|g�Z���������������*U�S{����1��pL���p���_m}�9���Ll�
�W����V�1��p���"Up���������&�~��~Y@-���F?r���;�������u�������}���[����
���>:����	R'.
��,���e�v�qkMWb���afv"C^����wyN�V5����;z�L�������~��7����>�+�T���`�����/���}y
;��F��"��� ����hb%)������L���e=uR�m?c�������l�p�P]uv��^���;��J�����+���t_-�	Q'���l��/�T+�f���"��`G��C�����+j%�w�J�=%�������4��Gi� �[����8��;�E�a���������n�_���?Id4�z
���`�t)rFX��!�^�}��Q���rG��������:T�vj��;66<?~����*W6Nw����hj����W��� ��Smr�`�p���39-�Y���@?bm��vR+�w�1����
+8������*��N_m}ue=���V�Z5���j���"�3�X��s����_�?��_*x��s����{Z�%�z��3��_�����8 ��M8g�����������)^���C���1W�37�/��f��:E��������n:��Ld���3wL�a�}1������i������+r���7������E�����0����;�����D��
[Attachment: perc.pdf (application/pdf) — binary content omitted]
@W$x���J]��.�tq�+K�E�-���z3�5���PwM��	��3�wq!B�'p�{�w�N'7�Y�(r��W�Xv�!���!6��O�=+��c]��
�4:���X�$X5��6(Q&������;�NX���bz����L��r�i5�Y�Y��F(�-;S������������}�]@H�a��KL�[{hn8�:�[�`2;g��F��L �V~��ni�4X_Z����z���)l-��Hm8Y�9���k_jLu	)�C���OM[rA�v<��$�J�^���Jk4�q�aJ*����>���J�
A�8 ��2�������9���������1�W[�i�O�YU�$�S@�h������H�G*1}�V����=�8�	�1����]�?�m��	'���%\y����'X������[���}}���0����Y��f��][z��ut���$QC�-5����J� �T����bX�M������]�<^M6#l���fi���mE~�Qc[[?�au&N���R���:p64����7�[^���/t���.�`z�=��U��\���+�����M��0�-�����\e���
�F��f;�e	�pH��������X�#�F����Q���r�^���S��Z�*���;�>Mpm5�Jm]k>.�]��:��D�4���8��'���H��(��&�����M� K��HQ�@��
�J~�
�*
K�U%�)�Re����;PC+8���!��#�6~�`�mK���J���SEIew���;��	0)��i��:�U��'��?���t��@O������%���Y1�J����L0_OM}�`����(���_C��$�*i�
%�S��>R1w4��T����\!M���'��[�)�yM�,�������&lV*����T�~�x�k�)�ou��	{��3�K�?LG�*�Rpb^[^���D���:3p��v�]W�e��]���'D|�	vy��[���Y���%O���IK8�������7�wG��h�`Uh(��&@khUiE���}=�Z�@=H��C�e�
���.K�[z�$�kK��A^�0��y6���Q{���17�Y�8����5"���:�������mie��p���Os����K0��Tm��d������~�vA��lu����mk�/(m��iF�2�y��:z��	|��!���uO��An"��Vx��
�����	n�.J>���5��tk�������\�7T��"��Y'Dp�.���F�,
\B9U\�1O���^��u*Go>��&����X�%�{����cM��X�@����#h�&���+�
 ��=������{/{C���Y��P�u�`��mC�*w	�4����xE�7�����."�����@J+S;�E%�]X��-����4Q^�F&hB^��re)���,-���wq�fk�}h���o��\���_��[wP�L����$8�C�Tj��5Z%\�c�#uu|����f�L*B��}Q��emc���Z�����M���bt��~=���.�<�|�Vf�zJA�I�7��7Q�����$�+�zb�R���p��b_�c����Msq���E��S�c����lu�V9S7r��rJ���Z�mBx�IB��k�������M�0i������z��w�P���oM��Z#%�g`��&����<	��	��Xv|�L����>I�U[q����Z����f�E�M���Z���n&
&���A�L@ug��OhF���y�b���>%�"����@�H��'I(W$Pf��w|�&�������J���~��}�KF!c^�n3w�����ZI�����������V'�:�w���O3��X��X�J�[��<e���$X��vcA�I����x-����=�e"�k<��O��AF�"����~���h����,��=�BAk3������&�G��k��_���^����r�`��OD�t(�'��{�d
�j���z���-�� �V��'����|�|�����u�SR�"����k���fh����S��j�'�_8tGs	��}�(�QM�����j��&���P�P��;��	���jY����&�����iO�e���oVG�&l��Fyi�H���f�A.O��4���n��S�n�pM��=��K�T4���H��������%��k��"��R�Cm���r�?�^O�P���g�5_r�yxa�
�y�x6H&��Ad
�\���C'�����5��[Z���`=��V��:�%14��VT�@q�'�qW��d�T�n�j��=��e�*�O�Z�����h��&Z�����I��r��XF�8���-�~�qMh���\������Y��U��[-����7Y��q��p�LI��@m�'�\���&�����$X�t�K��&��%�0/O�ktYp��&�J�tl�kH��:�o�j&;	���*�
��l�Ky�A;�Z���U�����4����v$�Z��E6���p�slo��je�G!Ky&$UX1��5� ��+Fw�u���
����$HW^s������5�iu�=a�gX`z�x|�1���ZVI�6zp�{��&�IB��N� �5	(����:���x)=��Z}��M�'m��]S���{U�����+ws3��V�Gz�2����j����6!S�E����)?#$�.�%4�	n�)�n_���MEh������U5wK�{�=���1B�"41��Z��`�V�K��j0TB����m}�56V���n���x����������t�]����O��WU'��	�!��tE���']�\�*��M���U�~��Z1��l%K�����L����{��v�rS�mx(��5���+���F�+�WO�	V�_O�d�+��x|���e1��"�]YE��V���e��|P~��J�.9��;��>F��6����Vh�4�_�f�=C�������
�<���J�!�g�!�I��f����[,��FeM��t�P"7�u?��Z�D�	("��[l��t�uc��t�E��_m���E��G/4��(/���������V���H���&E)�����M??���Z����zr��_Z�`��g% �+<�'8���fjG\_�Dx��-�D�/Q������E,������g]3�,G�ND����aa�������\J�}@��[��y����Ri�&��$S2��V>���8��1v�`7|��\;�X�\��!0�h�*��,@�xw$��-�}	��x�g��a@�$������OZ�k����V�C��?�+�����ZR�\�@#�;�y�g^X����u��������	����2��������,��U�5��!�m���@]��x��6A�G�6gG�b�~�r��y�F�^���0�i�Z_���	���Jx����<���rme���6^��OP��|�q���Y�R��J�V�YQ*�Q�HkyC�~�Yn3s�iy�~�y��n-r�;�����z������������j��kY�g%7���5WPj':�o 'gT�{�US�2F{��X
���$�k��V<,�f��K��Y��C`���o�QWm�K`W&P\��(��2��M�����U��W�\�^�������xw�Hv����[ru�t�3O�je�;��R��JB;A[[N�l��ou����7�S����{�	����*���������d"u��+��Y��I�|�wn\��!���ye<����w�*� O,�q���0�H�j(��0<�^�@��H�W$���k����ze��\~I����������-4�5��������+F3�?�-[!������O wd��'�`?��h(	g�MzsA�����j�{�=��������@~��@^� �m��\����W'xp5qHek,�����-F=��O�P���,-I�|s�^�����1v?�7Xa��{�&�"��@)�����p*�%��k	�����Q8>�-)�e�� ��zF�$��~I���y�evP��9��uX��*g�@���'��{#��RU��f�����z����K�/�f$�-�8�l)���Be�#�Z�`�S-.g"\�B����k}�N��W��gh+���	���7qB����K�7o������`�3����y�D�v���]�&]����(W�S�6{R���Ku��JD�#Z\@�:��u%��YX�����k���P]_{
:���	���W�f�_T�
�eKbUO�������EF���l�,�_�����OV
���j�{��"�oY3BqyB���^C�(�[��I�KE�.&p�K�1�7k�V��[Z��{�[K�]O�[EBo���E�k����<��Ixp+�*�Z;��J��a^�����
�����#,�&e�v��L�����d����bG��;�:f
��-Y!����:RG�2G	�4~Q.s�_�&"{g@�������pv^����H�^K��&P����#�($��, �-�$����Vw#�!'���[������3?������G@�y�&X�
Q�O���g��sU��hU�f��h,��ru�"P0������U�n�"�M6���T�qB^��� �)[�����f�|y����<�s-���c�r�.p�^��zy}�6B����"ZN!�+�{V�*!�@/�?����Ayy�'�+�.����&y]��1�D��A%���[�R�*��\��N
�����R&P����$G�1b[����jh��7]Apk��v
M :�5�����7ovxkO�����RaM�gW��YY�R�-;�*�B;���_��/�M�K�rC��C_��h��r����+����q/\?��N��J��,���9A�Gx�W�/EV�b)�0y��
p����������j$����#��-F0�&P[uy��Rm�,u�6�����Ju�:v4k�wBS�WNN��:^)�%xk��)��py}5CU]�A�����D�6����?���e����7���7����~�2�y��f������e��}?���_���j��=�����uWZ?���?�������S)�������l��s\k'h����XLac��A_��fNZ����J�����W~����M0�1�}\<xS&����/�o�������D� q����� 9�-g��q�{����w���U�p��|!�\��|��*�a��G�,���-����;�x�Ea`��f =`��
����D�H[�|g��w����E���j��K~�[S��
.���������_��Muk��=��[��2��������-�a��W
�����b����C����-��l8�=���n�vnx�����p���c������
({~u;��	j��al��n�[�[;p������(����W�ck�"��u;�k�D�<{�������oo�h�
�d�_1o����=m�)���_�����������~���������z}�_~�?����?��_����������[�=�����_��o?��O����f[����o?����\�����~������������/��>����������>��m%e�L�?�7%��o�e������=8������������>����y�M��^�����{�6}im����{2�����\��.~}w>����r����|<��~}�>��?����[2�Xz���"�aT��?;����`�����
���u_6n.�v<�����X�����1��&
+-��+l�E���6��6��u�156B��d�n��m;@�1�����/����}�_~��/?~}!�/f
��?�k3;���|�y�\�?n7�V�#W!d����9|���<���P<����Ss��i��m����9m!U�>�s������}%K��gi���Rv����I�oa�RXX�	�!am���m��k�����R����>�V��y��-�|�&�Gk�[��A�	k����x,��B��1�~����Qh6l��_5�w_���i�k'�".�4|N�D
�.��p�r��5����y��pD\����0.�"��J�p��Z<��s5��u#��>��4��!+�|�>��Ng�����K�qf��|�_���`c�#�{x`$�	�V�
��VG�v%���` }�Z�������f�v��)���Fh�#�
�A�����z�������%���?����q9_�U�������,�6��Y����um���)"4L�������	���r���������H
_)Ig8<�&���b��@�j��aHeX��F���P���[��E����ey����0�p�p^B�4�k���E��~	���.�dkd0�6��ll\�����`����e4Q����EFnH|,%k	���9EAe� ��N[���y�^m/�"t4O���}�EX$���
����gTD���Z���j��
�0"��~+A<HB����o�;�^F[�������� ��
������,[��k��M���p�B+�yH?{�W�1��8'0.�G�nFP�n��s��g8���z�2cbjD��8����_�����]�z���t�K��n�����23�R^q
6�Dz	C�}����B��r���$����l���vs]O��h>�|���r�:�f�6��%*���;�x(� B���9��\�������<��.~�)������+h��)0���amt��6	[��A����I��~��6�*����M����w�A�=�1���
�=����[
���uWJ��dR1����0KC�!s{������Jv���"�"x�|���0Ok���"�B�<X��z�#�gFEH mPa�I%g��{V���7g���~����x����X@���`���Y_��S�/)0�G��l�"9x���V�|6<K���C���{dzO�&��BdX��W[����b�RkX������!�#
�%�G�s0^��������5���Hr�����
�����XQ$��u�9X�X�EUZ.�
��)2@2K�r�N�d��U6��!��I����4���!4��t5I���x	"�"��2sK�0����?w����F��5�Dppk`m��p���4@���,�e���[H��F��	[��d��)����?Z��`��L�$~)Y�!��.Fz<
V�!��	�_x������
{_�1�k=m�X�1���FS�/z�
0��8���� I�=���-����r����q�!�+���
y��F+�����d��J�0��������aP!Wm2��O��{A.DT&��#��2�o}3H����ZP�w��P�9(�S�Fr�Zb'Ke�����G[�e���5s����	��.w��xA�0D"�7�LN,�H��.�e�hW��h�v�v�]O-h�&�]
x�Tv����0����S3�
���P�=� ������	/(��BV��<�!-�vLX��'2M��LDW&����$|Z����^��*��W��R`$���s���|��������T3@�H�Z��{��N�%��=��#x1��J].GJ#JJ�R��H��@�@�����$����4��o��`���
$ [b��$IA��U�T:�<�e��D�`_H|gC��p�y���N��/l_�('t������ys�\#��l�HH����G������\;� K��~�syx�H>��*;�)�������~�^e���Yg�P�h�h��D{f�zc8X8��!%�5U:����	�������!���8	_R&�H�XI.<�����b:W�o���:i��1��q��Y����|Ik������Dop�y �
�-@X7��~�6�����������M,B2@;�����,>c+�$�%�a�
13�����,��tAvditVT%����vr~�m,���>=N+�.��+�%tIe&�������&�'�9G��S�&%0`�g��[����o��`�tf�� aFK��
����1;��� �,��.��r����v ���a����{d�6\�"���KE����V1F 7�M��D�s� �#���,B�
�y����l��<$�,��N�j���%<�M�v�`���d
���j�3���AN��&��jy��hv
	@�%�U�����z1���m��@�7���0�O��DK�������9I�P��1�7��c��S�=aHR ��l�
�!ozi�/bL�S��GHs@�U+p!>"����J`��w����ui���:���i��N�|]���S!�,y�Kc`�#K�J��+1r��F�Az��G��fG�9	986;lf��c�s���� �@wga���,��U�C����
����*_�m�X�����'�/@Uy�4*�M�� ���e	�M���M�c�vA�K��`�����JW ���Y%��n>���v����~���+s�W��6<��a5/F�_�B�����]d�Z}<�a�l�!�gX!��
{������!�b����L"��CVH��RY/EP&��Z�HnE���/��~���y�4#5K� ���JD�@����8�d��9d"< "
&H4�@�&��%�C�����F*���g����	2�a��Ml�z�S�Y*xI��L��"`y��8�Q$x�<8"�t��
R�� LoFUR%|�;|�^���%��z
og���`yC+$���gg�`��m�&�3E,{1��6TD��:�o�Jq<dDZBy>�����������&�FMA8a��!��&������>V����B�-r��A|d#���!��H�`*G��@��A��q��WY:DPiw�HX���y6V�#2@���|�U�Eq��H�������!�S�r1�K���,�v�@���������29	!����M�m8b� �i�b�H.����L���B4Y��A�f�Sr�$����)�q��,��r�io�G�P�Pl0d�C�m6b1�
`���������U�a��2f�	7:&Z�&v�8_d,��������X8�xqg�D�3Tzx����L�#�#�u�����������A�����	�K����%V��$t��d�(��6���{3����r���$�6g"B�{k�s�9U6R������5~��<-�R�0}k��=�73e&z1��4P%C6�g#$�i!�I���w���"!C4����.�h��`����`�6�f���~���S�8�����%gj�> �*�T���H�b���u(���
x�
,�	�p��QlSpn�9�b$��B�S��[��r`yn�aG�4����T	��:'�)�	K��8D<[�-��S�r�b��P����S>�g���K�1p��gSE�A-���`?�)��^a:{��#F�'D]�&��T	�"d���r����h0r9d�&��[h�P���1��&�}Z(�����k��Y��n�N	8��l��zcI�1�Q5s��i���A������1�
�������t���~����{{���X�$+�����b�F,���T
�$B���=C��s�G��6Q*�9h17�������rN�K��g)�h�NKNij�&�6Bw�E��:�U���pR�9�Bc�{PIe4z�n5S�u��J��������]������r�{x�S`2@�D$Z�#uC�%v/ac7AAG����:���E�o�0��3j=C
u�X�s�k':��5����c�������|��i��8�/Nu\A�r����[`�Sb��3$Z'k!N�)���(W_���v��KP�Ja^�z���H����F,��<����(�K�%�Bji�����+�����iqZ=)���a���c����</��f��^�/A�2,�"Tx��$�ty��9��G`c$a��@nB�J��6���Q�������uO�k,vY_�.B�3��G�t��cpc5�
"�*~��^��j�H���"���k�J���(
vNpc ��g��_'�4\08Z@}�J����Q� ���l�6f��.���8"���k�n`1F[F�N�]0��Z��~��0!��Q�;
P���H��fX���`���_��K#�=K$C�HRH,���R��-��%I[�h��E-�/��S�g��rO������.���������7r��9��\0?�t�x)��8��*)����f-`���^�t��$�I�-`����Y
R<7�>:�T�45��<��.�p�	���U�����
4H&=���J��
T�p�3����Kb�J���S��xU.��13��f�{��`
5�]��(���m�2��_}e������^��4��s�
��(�3��{hy�c�}�����F��	
�,(5�����8���k�jh?��R�����z���2z�"N������b��kx�O(0LJ��-�(�-����v��K:�g7V�"�@�:�	 ��3��+�	��Z��
mm���l)����B�C�N Y��Y�n����:d���.��e�pFy�b?z6
Y@���!ZF�����f!���X�`�� N>.�G([�!���Y���DS)EC�/6�(3�������2�u����8�wP���*p�rP�@����(��H3KP��VI�E=���1�5%�+���37�|�@De���Is�e7D����}k����
�l#8����Fk	6N�m�+����@ ����=h�$����-q-m	$m�|�+�*�o~���<�����,[�*A�}���)�"�H�Iuh)��0�*y���cc���M@o�A�PBz�U�r`��8���	�|O��������O���2.8	����m
�}�b������|s��.�-R���|Y@G@��!��i9 lf��j�@t�5:�dAl&��t��Etx�nKA�(�UZ�����1��b��L�G=��?� [�.�N���0�-H��\)�oX��1(�yHM!��G�h�����n.~��X�C7)T��0����?�����-@�� ��/����)���a�T�&�o����g[�k�0L���Z�`>���2B ��FH�����5B��'8��P�o3�'�!�w�"P�;�rS��*���;B����	�]�F�[��n��� �b�
�����Y&��	���1�$�B����Ab)bnw�|�l�zcIY�y�����e���\�h*�yd�t�/��[{H��m��|+��i�sR^��/Y@P����Yi����tEO�*�0Xg���:&����:� b�����X�
��@�p�K��W�|tn&'�u��]���[��iw;��{'F<c�'��=0cY1�� ���g.?���~���IE+*(�O�P��v�Ypb�*Spm�e�z��X!�S���r{dQ���2c��c/�?�#���:�}�R��m����l�������G
m�"[�6b(@J�/BrPWi�\��[�l�f�~�	>������1�#}��%�1�n.vJ��Z��_E�dh�d5t�mT1:g����ulT����v�W����V�
]���{\���Xt)�[8�I�W���[+��
���y��09h����dd��~n�<
;��yV����C2@<�</Fh��s�X�l�N�?���x1R�L![Fr������D�YpnN����yV�;��S��#���G��P����ZB�Xr�.1#Hp��^�y�q�.`1q����$cw1P!����D�A�1)(���@���X� ~�8�4e��s�<m�l��}�A�@�Bw�E+e��+
?W	dx�X��%(������{��F�?@��zQ���*]��C�D�`�7�����B�J�K��"S	uRO��-������M"�V���)h�r������.$�����
��wb�X����=�rf�!�����B�r�2!$�!f�j�0����H}(����[h�������F����E�	
��&����t^�#��?�0�����1�8��B�]dU���0q,C��������J���Nd���C�L&�Q��,�%TD4<2X���V�Z����wEwfd�
H�^�q$����������0��9 p�8������{��o��[Y$�����e��#�w6��
j��K�=\���Q����t$=�����(6�Z�5�XM`>�:�CwJ�CM&<+;��BV��v2@��S>�W�|�8�_<�+��=�D��5�~�0x��������dJ�
�8V�L7B����R�B��k����Buk^�R4��5���J�8����x��h+
�<$*)��B�)��$�S$a�e�p���1��d�W��������B��-�/���#�Aq��(?v������V2E>d�A*}#vS�G��b�6M���mxvg������
p�s��j�V�;����S�#0�Au�[���+eP���������'���H���������6�X����i�E���)�0u�T�{���r�pK�	\��5��7��@IW��iBl�b�����^�"E`�1b*�)6��B�&�G!PFb *-��q������p��gj����C��E�W	#�����sYvc��A����!3�f*]��4��;������X<I@��Kx�@���������>�SHf��;�[�=pP���[W��.B�F�a�������wK�uX�$)=9��U�sz��}��{������~<�2��QW ��-�&R�Z�W��i}��e��Q�w]��P06�����?asr��O��	K�B(1��uE���D~}��[s��&������_�Fc���_~���b���Fc����9U��r
��b�"��~�����s�+���k���\��r����Y]���8r�Y��d��MC��r���[�{XC����\oMw�?M�a�f��o/=��Ncnq�<�fn��#���b�{L�L�Vz��~�j�)��'��8�Lis8UN�l�x���
����R�p�bE�����J7�~���Vc�r��BT^�S9���l����5�C����z(����AQ�B���O��6P�
8.�-=�>��*����-��/����:�[`;��jTW�����c8f��[�&�d�83����yq;���'

~����<�d���`��Xu���zKK ����Ik��
��d|YE�/���s=���$�����l�%��MT/w��T�����[j�^;�>T�G%F�*�e��*K�x�Q&�`uHZ|�U1!��@<��%#\f�v_����g�kC{lk�zm���*�}.��D1Q(C&S����zu���0/
�8���W�3�d0V�v���u��;7�C8e�[��g��������l�F�1�1��2\e�.�k	c�!�=��w�z,c�h3@�$�=;1��<�KS:���S�X��.�}=�ji u�c����������0!T�iR"g���k���;����%2���	\��-wJ��a
S.���z
���b���Kbw+y���E�d-�|�d��R���n���ME�
&
�-X!��sb����,�
�!P��VZ&��"x���d���������\����D
��b���C&|���,���Nj\���1�����d������3���g����F{9��tF[����Pn,@}�v�v��XX����V�
��Mp������iElmY�>L]���
����LF	��oq1z-���k�����L)Z~��/�9Bl�^�F)�<���p��JJ���1��X�Q
Atm1<U�>�r����u�[
��-�u:�L���n�~����oG�e��#��%����D`��o�2�^0E�0��L~�����5b;KE���Y"���R1��!�>�\��R_�a0�E�%�7<�nI�<������d����g��a;�+!$���.�e�!�#[��}��������$�	I�laF�	�*$��4�n�Q!Lr0�9�-��b�Q8�c�v�Cn����
�|��3Pdp'h����0���@H7BX#��bV�J�:r� �J��k�-|�d��o�@94�f� -�P���.����8�.)|7��>,�!�#"��I��'��"�#�L<(��u�2	�B��xN�Hc�(�v���L��I�D�E���Z������-��d���N� A�B6I�K��^�e5���(F��l��8/& L�8/���B����f�^d9��7�q��D��q���l�B"CiM`�%���
!�/6y�PE��Yw�1a�m�w�Y6y��T�&�M[���v�9�t�a���Y����D�!<A�e�5$C�%�[��� >���X���W�s�{�#�	�����,CZ[�f��h���(��K	�0�� ��A�e���'���:�0I�d�YixH�,|hy��x���
�tZ����HX�n�
��2�*t���@R�5qfC�B��m�]�d�u���a���~c� �[��2�H�z<+�A��J����w��,]h}l�*�
�!��C���@y�x��6����O
�
����!��
YD$�q�v45l��[/e�s.��!&���#�R�T��X��oo���e�
endstream
endobj
5 0 obj
   18971
endobj
3 0 obj
<<
   /ExtGState <<
      /a0 << /CA 1 /ca 1 >>
   >>
   /Font <<
      /f-0-0 7 0 R
   >>
>>
endobj
8 0 obj
<< /Type /ObjStm
   /Length 9 0 R
   /N 1
   /First 4
   /Filter /FlateDecode
>>
stream
x�3S0�����8]
endstream
endobj
9 0 obj
   16
endobj
12 0 obj
<< /Length 13 0 R
   /Filter /FlateDecode
>>
stream
x��}K�,9r�����p�����c+@0 ��e�]c�y4,�ek$[�`f2�c&#�uz ����)�������}��y}��y�7�c����_?������/���l^��������[���p��7��-����k5��~�����|�O����������_���?����n6E����:�Y����������������w�-����f�e_^f+��_f�������l��W
[I�e�����_�����0��9�6"��t5X���%o�q��f���)��l!�h����-�cm�06:S[|m��-5��>���f/��r	n[M�T��`m��X_7���nv6��S���*�����(oU�&��E5���������~�A��[Kq����T�e/�Z��aQ�I�E����W6y�N"�X����C���^�|A;��h������JL'������Wg����}�������Hb�z���<Kn����U�{C�A�]�m�B,���4���e��yG`N%���75�����91[q>��~,�? XY?��|�OOt���'��-�����~�%�Po$"Y_�n�7�1mKI.��
����F~mg/�ES��le��B���j�1����&���]\����q�=�Bh�c|5��c|A����0n�y,��������@W&h�=5FEl����s>�y29�<�o�����3�\Y���m��k������re�`r��f���F|l��!���cI^���}H��"���%�R�q�}E�+�Wck��/A����WV�%�Z���F�����Od��J���������H#�
�M����V\�ID������=D����#\��LSoR��T�5����������K���j{�`k2v@��nj���,�)\Ga(�*�I����r�(P\}v���jw@#��J8���<89* ����<���;�����ol�&�'LM�L���b��:������A�HT�8���q�Jy�-6��s�&���41�ry��*vX���7}$x�����zXx��Ox�?����x�K�b���.G�9���c�cu�5c�@����f:�A��k��]�b1��&���!C�(p&�>{�w����� ���I�?���T�T��_e<�����R:��X!�#�Z��:���Y���z�����W���XAT�����2+h��]��v����tk����_j3Y����������J'���C{�;&\��$����mu�*a��mm���<���a,S:J���W^��V�0��Z� <����z�Nk�}S����/!�� �k��Wx�<n3$�@Q,�
N?a���f���+/�:�:r��}c�q��r��Ow���	����pr��VOsL�t�G:�e����v"���	�������K�t<�������mp(��)�4��6t�����Y�!?���-������j�u��A����HM+�+��K%�t�5��%�Q2 W���d��\��>��(@�W����R�f]���*tXo?�����W?�"�c\��x��F1r��3�_Zs3$@R��w�l�2Br:u���s2��P_%RXy���9���`��Y����D�����Ew��+k���r���q�N��������(�@�k���k5F�k��i���Wte$!�"y�Y:�Y:����lL����=�r�9l�x���d���&��UU)��E�}�' Y�mM~��"N#Z�
tu�v��>�N����������~?�y�a�#_��xr�b����MX�d�Ye�����e��+��J�rE
$���+.��$M��0 W{���] 6K��V}6q�gY�T���>�v>�0���75��:�E��$Q�������)�������P��c�l�#R����2Y1 ����?> b�%'q%��*m�j�w��W�`��*�{��|�c�aZ�$D���*���%�����i��LP!4�C��x�3��l�6��h��3g0��+yD�B��]	�
���4u��2��G�<�sm���������
_�W�h9��MPW�x7�c*�K��g��[J�+x�G��'��9�>8N����g���5�|���]�{�O�u�74T���Q���t<k�fV��`9���g�j{�6���P}O�*Z���[�@������������eT�}�R�J�V����j�2U������(x�	��3bv�Te��S�@;;U|��"�jK+�U�P}�k�$S[������dP�kwf[
_����L��Le�Le�=�)6C����=��4��<�CB�B��Px�RQ�Vu��:y�;��������>���I2��n����p'�d��Od�����Y�f�����
 GT�'����aT�<�R$~��&�|���[lJ}�I7�Q\�a��!7���L��~;����c�!�\�GOyMUvV���1������UyaR&j-�T����J��7@�@s�u�/VL�e����M�Y���,�&+Yj�`Zjmi��}G�jqi�`pk���\/h��F�B�A��(�����-q�y��e�A�Na�E���-"dk�)����������zEo�~�duxk�\�����w���V)��\��<������I�~��{���`��i!Y['@#��������y0��	z���K�4�}�(�L���$e��B�ZuHx�(�Y\����g��^�CA�S��,N���5�W�k��7����**�+�@�@��=]�0��W�q?9l1���# ���1��?#Y(�l;�sTgfG��0%�A��^z���V^)��W��q�� ����R$z5B�/I�V��Ni+T�:2���K(�c�������U&�������QB-��HwN����r5���H�!�*�������`�U���	�������g	����B��,d���Z���O+&�+P9)��#��_����.�Qt��:����'x�pT��"�������e
����\�V���F]�i��Y��#Z�E=��^����!
�d��-J�lZO	��6������	&�5IL%��5
�jXB�FO@�3���Q�X��jq��D'|�!�N��:	�+N?��� �6^>������%H��&��'��������t���k~�~��I�$Kv��;�8�	��B^
�_#]�<����!:����W	P�����V��|���������Wo:}�MTV��;T���J�����5���{�^�o�����p�������AB���Jv���v����+���;�~�v&l-���
������{����_=�
�/��^[
����WW�d80%������<9�.?s�I�o���?��z;�E�f�E5O�
s?��%�`��Xy�^�>`��-D���ch�Xe�Z^N��	&����w�GT
V�*j/?B�FA������2�Yy��6(:� M@��(X��p�d�)����:	6l���6=f���{�q5vh3�0��{��Y�r?
<�$Y�(3�Vm
YCz�J����*�I��)�1N]O�F���Q�&P];�(h��Yw�$�H9���++�}?��/�m���P���]�L��z�	�_�&l�i��Ah�4���g+�!�?��F�++]=���g�+lOq�����[S�@��p,���X%����)����3���� '��II�
����M�Uk �����>0�G�i�I�������=A2�P@&���
g��0t@+���1�^�x�3������V�-�v�����������+a�~e�&}�u�U��U���h3������Vd�!P���
"��/�)PH8{k`��Q�;����u5DAAy��� �`��P	��3*)-�ve���"����\��Be��ta��y��
���t`K�JN���Z��-�|v��~���#�����WB�D��Q\���PY�1�J���np�`���yl=�q[7��)\�G5���QM�&0
u�>�)<��G�6-�u��7 Y{���q��[JT��P�Z���[Z!`���:^	�3��WA��i��}��aB@�@�v�(h	DtB83�f@n�j���V<
��1�$h��(��:����Q��i�fQI�L�%u}���Q�^C���F����������A�@�"���^��0�,�$;��uiET�C����+�)�@����*���3<h3<R
������L���F3iC�A��w����^����` t+������)��n���6 ������{�Za<+tO$��u8��6����FI�����Uf���gi}uz%��9��!&�������_!&�	���,��cR{���A)SpW���,E�i5��1��V��AB�t@Y���K�G2(�'2(�Y��,�:�)��N�`bqS�0�H#���ww�s�SP�m	�
KH�x.��wP�(�����z�����Q�@�������J�Ya��u�J��G�NO��:��u���,Q��t��`��f!�Z]��%��'m(���g��KusB�'��#���*���YR�S��$�J���3��a"��$r�=!�)����:e}�p���Q��.����b&x��!sh=��Ey$A�2��@��_XGPU��D�q7�ni4�=l�w����S�@-m�0�&X�ZQ�������1���JE�B���E��~�'�D��\�/5>�2��/D�*/�� ��*-�G��2�m"x9���z@�<|���(�@*���P	���j�!UY_��'�*����+�_����9!c���HB/�D��2K�<���� �<d�*o��i�����d�}���+S*+�#����U�_�(ih������_��8�aJ?�a*�B0�g!�O��v�����% *�c�4��%$�4.��d
�.#u������dh�W��5���:y$��"�N���N���Ny��q������&XrT�D��q��(<�������~�R���1
�f��'�#����,�ByB������g�7	��g��l�k�E*�:��B���E`TX�������J��&Jv�7a�_Q����	���h�P�*'J���	am}��:���3�_m����D�L��cS^��R��o�����O<j[r�e��{���(�d@�OdD����0&�E�/�����{i	�
����Jen�xoAD�T`���N��K%�j�?�Tu��|mu<%8�mC��
��Z��`�0������������o��vH
O4uz�8�A�����E���vR��+��/���+/��*�WK"����9@�6���)�d�OD���BP�g!��O���	s�j��p'R	
)$���`�B��$*�l)0�"�_G������u_�>*#	o�o�,o�,o�S,���+��(D>�e��e���y:��:�c�vI���v>��G��[a4�n>�7������&�VoO'�M)�����3��������!D(�V^pU�-��xzZa�d���-y�C���'���I��Kui%.A�}Z��A�CZ�-R������a��6W����Q������!\����`�u�{�k�{��w��Vv_}���-��nt.5 �>�y������k�	�o�B0G�b��*��Ws0�
�>R�a��$"(k{�P^����Ch�����6u7�&������a�3�7������=k��]^zh}t/��[��e�u��������"�k��tZ^������8U����1.��-F9M�����<'�+h= �
j�	B��u������i��	���M�yg1ym�n^S��'�k���6}�;���0�~�A=���.kb�e���%���Iy�q_5�]4/�v�������RF�k�lzAw��r��?l���>�O#'�]��F���t�k�6�g�Nr�-�N(�(�������i|=��LCi,}:D@��>���s�3���na]�w'�z���%��-�Z+��9��e�g���$Sq]-A�P,�����jW-1v����J�"���5�H��n�zF�|	�<n�������q������Sy����Vy�Yg@&�#�i{�F�:������rO�F��ZS���a��G�-������M�0L�20�/�Pi�AY[L�������bu��;3�"���Q���#%(+�����/�!��/�KR� ������rf��}�z������/
��.��+.<1C	R���+��
+�(����P,��I�]q�����_q��v��JrMG���W�*;rsV�V��[O�0!k�����PG���3����8/�#�jL.����
�5_������;!+#/�gU2Wn�7J10�+)6*�&	�O'��;=�G���W(,��8v�kP��9��� �y�}�|��/�Y��QM�8j���;5���S	�	e��x��E��[�y
;��	����;�c_��)�V_S����ZB��_��������Gh/[�RSn��`W��)��v�`���	v��b-�������..o�����0���Zh\�Ys%��w6��p>3�]Y_u���w��X���w���Y@x����t��o[V������G��9�����`��cx/�����&��5(#�NUsy��a^�@�����&��b�h���hY!�V��Ex�����L@^ ��[�j��zG7
�h��S��N�����0T��*��s�	f����~�}��B�y3
p��8���P���CX�(���P���	�����]�	�R�5V�q��Eiw���P�J�\���@0~+U��]^�����2m7�	\�33��I�]m0zR��][r�__������8t�P	8;��?[*�U���5�'��	�7���>�r|:������:��q�_��e]�w���l%���7oJ8L��(�&	��R\���xh����t|c�I�1��Ii'�U�s  )�t���|PC_�yri%���MS�L��H�CtG]Zq�1����/y����>�~�_Ork��M��[��R��29V��cR{F5g�CS�`��n�U�[���/d`+�4��5_a��]cP��H����=�i)0cR?i�M]�	�4��+��V!��
7�����Q��Om� �i�w�lX&@0����[�!]!Ye��te��������P��`�x�r+k
�����!����Iz��M��`3�y�z�f]G�D���7m��Q,-�5z��#���/,����K$G M�� �*4�C��Ki����"Z#W�[
�lEp!����w�0����0����\F�K�'�W��b��IA������W{�n��c�:�������p-�����_�zD]^i(NPWV_��X_^�3�/�A���!8:�t{x����MR�<���#]%3���>�����X��_�@�������m���w�G��]�S�Q���s�Q�]����������i:e�U2���'@������9�Q�_IK����n�2�T����L�8+��H!B����[��%[�����\�����@�N���r�������o)�[�������=�����:B�	V.�W�|HQ�V2�������>��J������!�-K�����`T&X��f���z��U��� ��>[L#K�h�
�V�WE%��Q��E/Np75jU�LW���\0�
�^
�b	�
���r���I~�[��+J�:A���!����0~bc�`%��G�~���yW1]wd������Oo�<��C^#@���#^?�:��/.8f��j��{m��tu���z��[�;����'Dwp
7��2�����NL�~�>�3u�B��f��=����_��2������SH�W��*����_#@KN!�kh6-�_�@�-��L�5��s����o�)#��6��(<(�<r��p+������U:�M�����'\[�����F�"�#������P4On�~�	4����@�����-�]�6O��&��,4W&xr����K��q��(�0�	�4j��O���:�t�TxZ�6�*��r����[�*�<i���#/�P�./�	���_
�	��oI��Z��P&x�'�� ���Se�F�x	��; ����u��p�p�E��T4������Z�1�N�M�i�R�=9��a*��)��a*N�0gy����P3�	��K�<��Fl�c���M*ZT�`,�?5v���k&���M�Bq��S�?��������l�x^�q���E3�b��9��*����f��}�����r�{�P��D&|�k���#\_q����V�j%;D��%�VoqS#J���g'�5�te�����N3|1[D#�twc���B=�]��<\�kC� 4���<��pW�k��<_�����P�_�O%@���������[������	��Ex����vE�@#A��������FAl��zj����1~�M��m�	�Amn���F�JS��EY?�,�]�a����-Y!r���+�Tu)����cq���D8Vv����[�AH��ce-��8V�AU�oQpo��R@]E����
f%�{���ce-k�9�%!�J�C��
?����rbU�����i�M	=��%
XT�k�{��B��;��P����	E�-�xBQ��"����JA#C�lQ�:��H��G����,��(��f�Js<��a���d��9:����-��OL\q���?{=�����������w���t��B�A��-f���$T��*�$X�Of5FJo��Zy
h6���*�Dv1tL?l�|���H�<V�y���#�w�H���[+,X^����R%���-[d�C���m0h��j��3
�!)W�K�� ���<~�%ov.�V\R��'�J��M��
��Sr�fJ[a]�yr�}�=l��E�
�U�K��o=�r�:\��wt�wwh�������M; �U������Z��Om��T�CT��Yt��S}�+���s�
M�}S�~��	^����_���������������2�y�����{������<���_?b�R)������W�"��%g_�������_�������n�g��1�qhM�|��9����o����k�������E����8��84�-��u��<
m����U���N�;�������Y��'rN[�0��7�"�^t
[�_���#D�;�$�_Ic�2��ic�R������eK��z�c���:��X������5^�c�������8lU��q�7�u\���+il�[��~�+mF�</���q�����z�����W�c[�v0��<6�j�a��W�c����~�o���i�)
��:O�>��������o.��Z_���_��Z|���r��M��~�����|�O������[�������?���Zfk�d^?�������r�?m5�[�7'��m�?�o�������J�K���.m���t���P��+����sq�����B�B����z�5�M"�["�dC����9�N���)Dl�~>��\�Da����Lm
������O[}���6��}�t�
H�S��S>���J��1���+��M�a5��{cW��5��X�Yc������.6_�K�K9����>>mr[����������tn����}L;**�"����M��o�C��(g����!��K�[���[@Vz�Z���������v���
;(��J�q�	M7��_����"j}(�S�D���C4b�����w�R9[�1Q$`�t��CK�
�%���,��B�����d���+}��:>R|0���J�`e���q����"
;>���w�"��o�q�f&	��3�j���H�H�3>J��2K�%�
is����J�0�L*��>������e��H�����C�1K�vVS1f�6�������Ag��G�>2>$!��"oy������&��p��M�">����aI&���M��|d��&��XO�%����?����K
�>R"��$�a��8��������?��>�%��@H`���A
��I `���L������fCb�x����
I�4U5�@H ]���@�'b�a4�d >��_$����$V<$��	Ub ����K?� ��?����n�����"Z�@eTb�������l	�0*ru"*
ot���|M��
�^&F�m0|a��]GE�O�(�5��El#���
�t��a!��QC:j��Jaf�
H���^*���S��F����9)�f��&[F�3l�d�*�13�����$G�M2�����m��������
V�cf��V&<8�����*��#�g�
�\�qm����`�w
��(1���p7^qY-������J���C6b��,�\��Kd�g���ATB|�]��Jj�E�z�����?p[��M\���#(!�����1�%�@�1��p��s�����$%������2E�.�V�7�Rk��0��8��a��jC�@@E���Q�6uN@G
`�T@GeV^��QE�M��)Y����k�l�'���DB��������;+I�**;T@�������O:�h���+j#�jra#����31�tJ����,��d�:*j#A6b�g��r"���#&6r��- $��U��q���{�`/ ��Q�L��D������F��?��$�#?���	����YAe���J��00��7�b�;^X��
�H�USY�8�����H+j��RJ�x)@la{+ <�~����#xx�G����k�`��7M�|8��
�9M���`��H�����,��Q����l���A���i�-�"{��u����������
��PQ�
��tx�������i��J�@1g���i��d���<	!������"��+y�eiS������.���	����N��l-^;l�d+��F`'`���s&���&.`����Q%�`Q2J��7D�k����,�H^�@"�51�IR<���0�j��lX[��80IP}��5,V3�BY��9�=��EY�$"6%�QQ���*3�L���CX���Z�,�^�\c&Kq�A�8Kr���,�{��N6b�u�
�6���d��&�g�`H	��:���P�$����8h�2�Q+��D�03a��������jfr���`�`��'w�`����rb*�.��C��������ky����"����#y��O�<P���$�Q�q-����=$�����2r���Y0i�Z�8T�F�h���*J` sA-%0	�i� �����U-6�! a�(aH��������)2xA��r["����j��^Ol�VG����q��q�!>�3��-����<����/g�x�l-Do�P�jGfF�0�5��%FV;���v��������H����k,l��T	���j����m�;	�t�$����m�(����
��i�����S�������/���O�r���������!������v���|�������{	�w����P�C��:����r[��~��n+-t�*�*$�4jP��Z�P ����g�6�1���j�.��rn�;bK4�C��sm���^#�s�1��m��_�x�/q�)����iCm�-����6����Nb!Y��5�f4l1�����<�u��-����6`@�H�P�_s;s`$�h�8���ZV�/�M�����*�`�^T���s\��q��N\f	�����~?�;>�x3�|���T��`���"�������M��ZV��8j03>J�l���:�H��6��2h����\��;��{8X5kW6��h��O�f����M�C	��$X��]�X�T9s�pv�@�C� K����!*l�~�~e���LcnQ��@�2e��oz���E��4Ub!�VN��8
)������!3�
���b%�Fh�,V,�R-+��Fr��P��j	���^V��:��-Z~����C2dV�F���c�"f,O�1�n����R%=I��������B�`1�b�O���P�d,��Xf�q,�8+���9����y�M��i�,�5f!{�$��\���c�7��n�U7�k��cE�E���]���N��ie��lj�6��[�B{�
M�c�T%���*L,`�q�����w��S��G�vj-(����������������j��y+�&a
�;�B����t�IB�
�0.���C:�E�J�`��1�S�0sg��"����DPR3I�M������'Y��3�����������(�`4fAEcV���������u��f�:�Z������^����YPE�<a�I`s�~:�D�	�`k\��t�:�M������i����0�Z�o3U��0f���tQ�@^��m03�
�Pr�@�6`z�`pU�!�������[�s|�����{
�)!�z�4������F����7��;���`��� �39��9�\v<"�0�H'�	o<��2/��
�<�7{�{��w�Fi�E��0�#��#{��3�}�ep�KJ�-�@��B����d�"XCW��-[@����E_h�9���J����wL�Wm��ME�W������b2�K��-&B�X��IR��E��kg5�N0�:���W'�U��4j�"�.3;V4�k��+��[�Z�zn�L��fJ6�a
[�~"h��h����b�n�J�Kr��8��y�R�TY}QT����� ���VcPQD8Y�DF��`t�����
\�.���WJxQk�p\�=��H�0P.E��S�r*1�U{w�K�j�r�eI���^;XF���U(h��S�`�v X��=�2S�2�%�y�I�,`}�/"��L=o��R���	Br�'���;���V�o������L��4A�+�|&ES7V�U��	 ���f4�F��`�`w�~Z�mb���rZ!T��+/YH��^��oL$[0�����3(D�zeA�X
C�4�Q���`����X>a�!!vK�H�d�5B�E�PNs�G<�������]!��`\��A-O���d$�m������<�0n��[�h�@`f�]MU������j��^z����|��b���]��d 3�TQ��x��z�=�#q�f��U1j��oWTw�A�;��L��>���L-2�$Y23�c�~�CU���d������,gb|#��@�L@���������#'`�`[T��EI����`+k�\wcGK�f�F�-���GI3�y�*�4Q2e���T��&{��*2�O�Bjz���`4��F��Y�������9�;�����{���d�M���H��/���:�$g������'N%�7E,��Ks�����5��!3���D!��%��2tD%��X�}(�albp�yf+�3>����A�f���}1�tC��S���~��M���r���?�oh��Q�A��9<�/��)�C2�j��t�T��5����C2�J	L��(����B&$�]������@b�n�����yV;Gg�-:�Z�?8?V��!��Hh�B�0�a;�B�����eR�km`����LF��)�"d��b�h���i���g��vuj��K|��P���)
���L
�X�a,������� �)����)J"��K�L8������������+1��=�V�c�h�p��0�����\��i�1tK��C����7�"���E6�0{� �F5r��7(;+H�`@1�p�o�Z��pP��)�b/��R���g����h��z:�^��D�M�[-�CS����lM9t�
�$�C�L��e1
��k����H~A9E9�Lf��AQF�>�9������$:=��L���r���!st0'���'�r��h��^�:���D� vt�G��a��=�kz�
4H��o�y���%�wg�r���C��>�>�m.b���� �k���T�<q%�e����(?���!�
��AIQ���1���yj,�cQ��#0d@Na�B���Z1�;D�@q��~�gxL�=�;?(jj"TAD���>��o�`��t����CQ6*rL�`#�a��o��&�q��7f!:?$�7���.�������n��������������{	�b0�����e@�A�s���e�Y�c�����vL�X���6���h��)q���C��T�c0I5
A��/�-��M;��EM�a��)�d���+��a�����3�}n����f�(��U-s�Q�#xbt���l�1�D]�3��KT��K�����{��w���0�J���yf&+����H.3����
��������+��j��k��`���R2�E&�s'8R��B���5��H-
�f����<�2�f��y��m��q���\1CA.�����v	Cb'�������StU�D����A�:/,���Cv����
��f�;sy�E�uZ�=�c0e����x�-��<����b$�J�h�����r���
f�����J��K��)��T�9C�q��:^�����?�-����X�v���>Z�OQ���y_��t�4�l���3^b]LE��}���%@��T1�tIn��A�w�.�P������nxRU��(�Z2f�=�d��'��]���,��?��?}��������B�f���r�5�6���X;�����K�v�(�P�vo�������n��6���
Ht�C��c�}�H����]��.�����mv�9����c����?�[_&%�W[{L���4�h/�<l^Co�!���A0=�tm��M���K����I��v�'B�lLO���o����eL��7:F�!�e��2�v�/L`��`����~u�]����N���r���M�\���$
W����nuh����b`�>�
-�$��
������PV;���@���|����-!Q�Ed�+����3.��dX���i��[@���Vm�q
��/GEk�������v�H�Y��>C�^�
�G*`s�]- kj������[��=4����6pm�-���|e	C���*��	�*���!
������c
�-IHB�h�,��Q^CwbL���WE���ZI�@��\�(V(a4�[DEa$�x��i`�S���Pk"��������NU)%��l���18��1���>"�`h���C�h�,����Z*$��^�*!'���pA��W`>"�{��P$��@����bk�/���8Jp�giyd���
B����NT�
f��n�ul8�}��/��k�c��`��
��A�����f�s��
��h�zL���6�c�o�/E��aL�����\3�n�M������m�v~V/����I��^^���5b��z������������Y�8+Dj�����8������U
+����p�����V�\�ndF���~�n\C�%����W�#�z��������d����y����<H0��t8�=�gAi�*���T���T@16�J�cD������o�#.�S�eG�����=QK&u"*��*5�-�B�8@�A����(�B���n-����h���*_?h�d%{D����b� �g�7���;�������*_p��O�����zG��|��Qa�"NJ]��Z�����n|r8 e�OK��n�]Dh<Z�egLCuB����:l!��E���+x������&pc���0(�Np(����(`^��MuF
�2��`��-��/�P^
q5 1���YEe���clj�VQ�+C�t�]L�AG����#:h��e��"��']�����"�����X�6�ot������Z?6Q���<��y�@��p@�a��CAy-_��P���3�u2[�O����T��`f&�re����&p���-r��
S|����X��y����������:��1�gx���&D�{�����]O��*Q��x���r\�FQ (�}Y�P��`mcQ"��J�/K�I���LL%�;s�$��-�x��gY%��=x>1Jbe0����B�S��x�����q@!�SzC93����#-$D[(pj���lI$?�a"8\C)���a>���n
�5�J�Z1OF�8H����%����� ���!k��/�\�5Q��
results.csv.gz (application/gzip) — Download
�l�0������9Y�pI��O�E�TH�N�����u������i��r��}�V�*���/�%��A��%���^)�w�g���g��G�y��y����|���:�D�s��<�S.EXJ���������YC#X�[���s6�t����0{���7��y���w	����l��h_����{��/<�m4����V��z�o�DA^��TeD��*���Tg��R�q/g��D=��8��^b�C:�`���#����{tx�������K���v/]����}�6�g6",��������sAp����X�1vY���{���<�R��.�a�S��������N�%�EQ���*5���,~Y��%�LVpsS�H��r'nW���vQ������8�s�/G"n��(
s�5�wn�.$����
��CbA���o��4��R�����y�XVm��������2�^4��=��P�O�,����`��i��M&5>^�H��^�\���NW
�Y��4^S�$�n�?�^/���t������;�|{(�=�a�f�3$`��^��^�#�Y����������������r6#������#n��7��L�)����E�e�����ULu\2��i�g�t��^���p�e��ta�p_�B���d0�R!/���w������{B^��3���������	Q��0��K�~�!&�fF9)D-���=��Wq�-�o��cC�X���5L����k �$��v����0<f����%{g��Q�� 9e����x�[J3�!��P+
���o���[��et�h�6�i��f~?=���TK���r
00P��fR>u�ji2X��:Z�dV$�g,G��t,W}!q���FTq������z�O^ ^Km���q�\�%��.��M|�
�������0�J�>0�$Z�X���L)��wI�I�[���PS$��)+Ur�������0o�
��,�v_o_�5H2�rd�A���r
�.o���.�����9*�e�Ri.����=P���8�:���F�*)��*s9u35z"LP�79k�����g�]�nM���E�h�e���)��[(��^���1������� �c�9/��H>�mz�����\
��1���T�n6�K4���RP��7&���l^dG�@����u�r���9*�W
���^9l�*�6��m�,�����&w)L�j���)�tR�a0i�KmU1�R)���-D���)����m�i�o���e�,i��w��^=�EN��|���m��e5��U�����hG�'�k��"	���xe�;����a�|�gu��Ho�P�u���r/���=�'�����{���/��BA�/�yf��)��e��n�W�_��������s[��"T�V�;G�'y,�������LR0?�"�������tq)�Nz�l�"G���=���G4���OT����#4������'?Z��|!���nX�~��L�q����t"���j�|Zo�a����`���e@M(
��\�O���2l!�a��o�����@��������(��v�|����3�d���s��A���;f���
�dDW��	Ppp�l�Y��A�&�vX�����F�ww������
�
��%�T���`NI�-�@M2m��B�f$�����������}��m�Z3��(�3lFE��8T}2���i���R,�����&���������a#�����*�p��<�Yt��aQ�����c���-���R�%�f�Q���U_��AA`�A|����<�Jg�����#<��HS����;7W[e?jvZ]��������Z8��r��nI]�o-l:<�6GIQ7b��4tb<����]��znlV��l���(;pl������a%��K��<�:G�����iw�4�lJ�g:���a�r���t�M�ZVZbv*��=����1r��+���.f����
�k�Tg7� u�s�
�eLy�0YT�|wm�4�(��W'����?��A������
�}��f��N���,D��]�Q|<����0=#������b8��!�>F0[l�r�"����u��/��Q��|���z��F�O�%�����p�w���^N��;�DK��$KS�i+o�����uk��
	��+�Hj��N[���,������/3�}�'�u�3��Z��c]n�{r\[��/yd5���_��;#���s�K�6]���R���}��e�t�4��6�w��Ep���ckd.���\�k��7���+�,��ta�+=���&��vL�H=vD��!A9����NE���(��,�}N�c���w���:V�F�����9S�G�&���h9�{,c�u6�����sh��U�z�W��K�C_Tz�~M�u�����f����r��&(fzS���V�tA�V��u^��U+�(�$�8��
3"m?��a������{p]BO�C���`
�����N�V�a>b1��ah�oDyc>����A��7�URy
]����k����������;Q�Q��';&��l���D������|�E���{������
�����q\D�k$���f��b��x7:KJ��FP��Hrd���6�%!���l��M�Bh�`K?�._����M�vdS�2Lt�_�[�2yl��G�\G���D������6�M��>���	�:nB��`*��u������������1%���t��|���[)��;��BJ�A%�8i,;��Q/��	:��2nS�����������?J�����=��jn{������E*�)��'���e"��{.R����9��U+����J�����f_+20�6�����3�Q���p���0LN���<��N����m�`,J���P,�f�$��������jw��4���D�qJ��V�u�/a�qh��|xVG��$6+j�6}��Z6mtx��oet�����Q��C�i�Z$��wgz�<bo$����l��3�S����'���\V5P00���~:���1�K�er)�5_��<�+�|��h�vbc��|"�U�wP�E���h�LW�E��D��%T|V��W���1��uy�J�U��}r#v
h�BMv��7��R��q��H��na�d�/��k4Qc\zsFE�'�*N��2/�E��:�	�����q�H��GvR1�`,������/�W� 
���$�w�f��b�	�Z��b*:�����k=�|]S�s�2�Y�k�Mm	X�jA��x�0.�)oa����^�`�*�f����i�����%�����=�����JB������B�	���+�)+��c���+��{eG��4;���O��v��g������e/����������xM�����F@�KY�v-,]�p��K�`�Rq*�)5�'b7�>V�Wi�*r
P/���`o�
�j Q��Q�s������� H3����`��2�r�O�'\������J�I���"���?6p����`��C��D����C����-�*8^�.^������}��n��y���u�KNo�2k��61��EisK/_�QL^��t����Q�N�X���)Em5�>P'��
{=�K+�+m�QZ���@N��x��s��	������jBFq�6Y��,M����B��
I����j����5/f1�+�6s����GZ�)��
8�p(ZL�Hn�H���1X�HC1�1�O�c���!i�I�rc+>��^��"��R���T�!
���(�*��}K~s^�&�Z���A�Y����N�W%��)?��5NR�W��'�9���RyOp�c�w��+�;��j1�V�OS����y�s��{[3��*���W8"AU����.i�����_�*�=���W�K1*gf"���2�E�5���t-Px+�MM���J:���69c9nC�������o�j-O�QU�����q����!�%4����TCt?l��C�|��=0"�~E�a\���K2�Z9c@
���;�K���^����X��*��T������6 ��T�K��������h��:R)I�f�
����Z�G��\	D�����
��*gg���\PCE�2im0�.@�0��Q.N9T�� ���$���|�H+���0.n�����4�?o8�������c��o���C������{��Q�9�DC���;n^I��?.�^���"�w������>tz}����SY����������V��]�w`��{�,,��a���*�|c�r�+��B������
��7-�����U�L)6i������-�z,�R�:d�3���cw�����O��+Q�S`<}���.���lI����p���������Y���G��Y/E�_�~|�i&3�h9��&��^�Wy*o+����D�7RyG�9A_���J���6D�7�3 ��O�S�p��k��
�O%�����o�_��t!��X��6U����}g��f_�
�R����[\^��u��)�7��F[&$�LO��8d^.e����"3#;��gf��a��G)���@��<�e�����'�i	UF�
3���H�R�����%�(�x-���s\��B�O�!�qY��D�r����S�S������}��u���R���k�!�y���8����q{u����i���0�f=���fK�m�`�:__
C5�'�q&7&��v�w;�<�`znW�4�SJ�8�x��MA3����E�����[k��u4K�%�yB�U�����jy�c��K���+xg�*��r����C��u��l�����{l�oj.��
6���S�r�d�W����Z�R���F������M0i���H��� {`>��/�:LI��I�t��6���$�v�w�j')?�O��~<�,F56v]�7�]]�ai��a;�y�Y��k�+�������_�7�w��_j���P$�-^�S-�����;/H���D[��#����������e\��'�Y�����-���:�d�N����"_����@����RTn��
�i%��os����������h�|��#����NPhi�������^�{�n����=y���[y~p�Z�(2��cU�z��'�����,V�/��N'���_�C����*Ti����,
����~��_�,#��u.��8i��8Qer��z(���k��r�Je������0���U��	���Nk�����0f��)uuG�[�����3�a��Map������{��VJo)�us���E�lV�q��?>���D���J}���Kmq���E���R��,���w\$����}�ge�:�?���>��$����b�7���R5����P�_��������vO�R���t
m����t>��{��zZ�4w�#�B	��#q�A��f�6�R��p	�8#k|��kn��J�_h?���|�����l�����`(�F��W`�����2	��z�����
Og�4K9��K�z����������o�I��%������JO�wq{$�5/~`�@H��9�>����qh<K���[�-]��T���F�-^���62C��_��o��*���?�}�8'��:�YP�uO������7�jr	� J�I�d�����$��Y}�����8>�����"L����b21�����{������d�����^%G���\����y�wc$���K�
�w�>5m��I��B8��Q��� �43 @����/5��o��.����%M��
���w�QQ��dp
�o�������O�R���u>��TI}s
�}��r������:���"������=�������f��S��J��k�C����'��rE���'��w�=����
5_W��t�9�?��W��v|(#:��Vj%X��,>�l��O
�����n��;�zw
,�z�����1��nP�*g)��"��0��G������%���m�Aeu�T�;w�����
r���j@B��I:� 7�_Hg��]i���6�@R)� ��(��>�=����li�����A'��!8#
��*�A{�_�h�E�����Bm����\�=<�cd��-$V�Qb.�n���>U����s7�D=Q����O]��O�:���x�)H(�1O
�`�tN�3�}X����������y���!o�;�a�C��j����6���E�B�Iz�jj 1
n�����&���5�`��!75��0����v�x���S�O���0��Z3 
2��{�{�olV�*_���p
�������4�|�r`�������8�����D��o�*�e#��*���Oe1�gy����b1�����v�O*�Z���
�0�v$��|�9�8�wG�P)�E���U�\{�fQ��OR@s�B�*��%�aN���^��$�Q�wq���8|�[NF��1���UW�Q����
�E��7���h2��_$����]��X
�PR���Fa�=���+6����QKan�Ds���&���h�x�t���9��+��4�w����z�d��~z���&�,%#��*Fo/y�
��D����Z����#������P�bC�Wj�@�^���g�+�������U������`�?�2�fM�<U;T=$"�fj��W����obK�#B?)(�p��N�7Zk�V�p% ��JJ������~��M]���oW�5���
m��/z�u���<��.�P"
0����A����qq������Z�����o[��,��~2�\3�����90���K����
���5E��b�[����v������k�ia��N�6��`�.�����#�W�����iz]u�_��4�����*'��`\�$����Kh�S��%�!����x��|B;��}�������hZY&���!R�g����
��I� k��1�u� J����7H��T%oY��,9�>y�7����h^���j������g�k���&i��X�!����
�bU-�������������q���[�Jw�v9h���P��ztS$=���B����/@��k�[�J�/h���h%��|Z>�v�^
m�
�A�3;�=��t���x��kK��_��f���
�}�E�J�O���y�6NlR>cC>���#�;P��]"��/e����!���l[�E�y��4quq�����:.D��&�o�w��?���j$B�Z������]��j{{tJR`W8�4��|oK�5�*yX����bgnR�Js�7��#�|��>T�?rRt`��J�`Q(�Lo�1�N&�"����i�[L5�e�����H�;�;��I�7������S���E6'4�d�0���]<}!~u!Mh������`,�
<��oQ+Vw>�����?Q��	�Dg�W>�?�������)�Y��S��Wm��c��r�����S��*ys��+��F9B����>��gx�z�x��Kw<F	��T��k�U����X����+�)D����z���@�����(��
P.��:�W|�SLJ��S�Yh5��[���:�)��+HNJ=�����].�k�[��D���xB#���<6�-�
���l�'�U7���j��D�����$�� 2��9wZ�d�{g8�:�����������*���M�f<�`���W��?^
����U�������q�^�sF���(���p���r��9�1S��te�W�&/:�����@A��S>Z�1����rM�i�`M��#J\9�����
��Gl�Ou)a�	���������Z�l`�?��U�V�:��}���		Oc�����.�n�nL���NO�y3��f&�9�H����DC��i-8��I�Y����uUwy����<n2V�n���]C��fy*����3e\,����F�l%.�/zNbeg�|��g!mwf���8�u"z������Z�]2s��l8�"�����!����
�6�Cp�p�/��VT�KvO^�U
���s��6#a�#����~>;��	x|����D�J*��I���6Q�#�Y��n��)�[0���ZU���h���W�4�~�������������]��-�JS�4��|���5����"B�Ef�c�95���=���p
7�WK�m�%���j�e?���� a��&������d�s������s��h�����5V���w�����f�8�:��������l�X-��cm�\�S�SG!�'�u�fQ�,�'���H.��b*X���}�T�ij��3.�+<��ur�������f>�z�F������v��D��~o��Hi�/z���$I�I���Q�0�K�E��f�(�I�~�#@��m����'��./��Q	`�,�D�q{A�������pF�B�fz��E�>�������0@��C�����8��zd���f�������G��;z�������� I����~�����rS�)�,s��w-|��^��4�gP$��XRL�p#��Y�
���`��$����]���0K����V�KoK����_��T�!J ��4D�7/���xi���y���u�(��(��;7z#]���$�?i�hL���lo���!E6i�4����j�_���.�3�X��%����C����J"�<Hr�s���q��()�����F�����i���	=�Z����1iV�����|�r�/E��
�X1�B������7�@~�������RC9qx������6U����^���������,��gz6��GE����� �����(�g�?j������;�����
W�k�af���wh�����n��
����c�7������%�m�����K\W��d�N:6S�����q<+Ys����]Q
�\�����������(�m(v���9���������'�
��{�_�|�=X��$}g]#��H^Q�qmF�= �@Ea�'}l���M���,����Z�������A���)+��n=��������l���Am�+��#�6�Cv��IM��U��>���d���4��8���Uk��&��U������0#�N	r����q���?f���b%ptz&'����o����� ����|���7��
�tj��z�j��5Q��RJ���=�L�
�:dY�8�;�9[��a�9�N1,��7d��*�P�FGI*���m����������{��P'z�sq��F]��d�3�)�]@����%����/##������j�\k��I�>_��kF>_MM?n����_e���������]����"=���gW�cA����v���`���}�/��%�-C�`�_)$���C�c���s����8�����U�&�4�-f��4�T����n�"1���K!�M�;�U~{3���`�Kz�A�W��^�[i�7#�S
��(`����A9�3#����V�X������'
�W�s��W$��LL�?�r?�OF�O�N
����A����F�>�6`�$u-��>�q��H�����zG�����wg��XJ��(5�z5��U���3}~x�>S��f���RB_��~���k	�HF�,J	tt%h�������GN���h�f�p]�Jo���H?;l7��������_�^�@��fTh�Hy��;yW�v��y^S����
�$��9���I�y~n���S�9?d�4S�V��3��B]��F������������
��>Kl������bK����/d�w
��������SWW�.��;8n��T��SV�������k��C�� R`_��j�f���Z���e�Bx3��:�By8��rl�$��bo��9�/�M��E��9�M�U���H�~A{_����U �>U����RW~�gGy	��St�)��-_B����v
2��u&/i�HBi��T|����4e�����7x����{&��p����/�4��)k�	����p�g��	��������7V�8w&��of_�3����L���5gy�c���d�G��YUoP�j��WC���tj�cI]���rt�u�;�H23[$R:��	�zs��������G��=�_��A�pr����Z Z���+sH%e�9�fX�2)(��e�P���U��ti���7����p4�%8\Ith�x������������|I"m|>_C�:Lq���<�{L;p�#��M�<l�lf��D����y�.My���>���
�����(���`M
���B����.v�a�g�F4�\�Sq�Hrv�����C���m'}VJ������5�������4�;k��FYou������K���P�$�H���u	���E�� �	�9#��-yN(���4}nx���^��7Fl]k�
rT��#:���x��H�0��	���?����8}kz]�Y�RSl/���IW�E�	�����zfv4PkP_'�(e�Z�(Eo��Qk���5��p���r}~b���D��PI����U�b��dN�\�T����R���a���aiq��g8K������.�IQ���	n�1���v����}P��F�7��OX�F+��D~5��ZQyik&���Zn���;�������Y;�<�\bHUv.yH��/�����1��){��L�2:&(�,��������-����^L"�J^�4\�m���T�(�|�������w���
La�X��PO���<
����_��_�t~��g'H&yV��l&�J��*	����o��8R�"�(!"8��cw�W1��b��������~���8�AjJ�����7�R�:����������rq���0_vj
��Qs|�r��BCF�r�q�tK�;�����bY�k���`��+����T�Q����
���W�3��+HB��aJD�����L��uKy
�Z�M������v��sS�EDV	2���Xp������M,����.M��6' �vL���l��\����Y���m$�Z������������j����lE�8���
�����#��3)>8)��Q��&��w�������V�����O�d�Dn�Cx���$9(V�K:W��{��o,�:�O4��Gy[m�1��-J��kQ����u����q\2��w��wX��������6H��6Y!i���>�s�px1�+�L�is�.c��JmDd�(Jl�A���,�&yb7����U#0(y���8��v80Z�
E���H��8A�����K�)��4
��5v}{�T���pK�5������HU��h����j1�U�����m�x�~�� ����  X���������H��h/_-{�RN9���������5U�S.���gQs��!���,���2f�L*� �8����Zw�>���Z��7�
��������W��,]� 8���rN�"������[?6u'���URsu���5�e[��t�����Q��r���0�]j{��q���5���)��CO!����s|���}+��"�����/��\�~�X�����3C�E��3q6���rIn����qi���&��.��l7���c�������d����)�Lji`�r�����h���f97c�Y�%����A��Ow�����H]A������ov���0:�d��*���H�7��?���� I��F������jt����Z-=�-�D��|�$PV?�����QJ���F:����[Ps�J�`~nC�Y�z!5H��
��������V��ddr��.�6�	�PZ�C��������;z���b� �M�%0R���Q��Bv���n '��
�vk��s�,gu<��	K�Ap&�����HOc�H��T-�z�������bX#����%" #��t�����HP|Z�6���}"����A��#yJ�P��?3BQ�x{�=���8�c=��T��7�1m\W��nWN��
n�MgH�d��r�����uN�j�����YX}��o�ui�{�#/�O�rf�@����L�J�Q�� }��B)�M��A�h�eJ1�����	�I����Si6�bK*R�oe��tr�`����`&�D��sL`���i�=��@]�C��E�?�g��5��":�;`c��;�=,+�A��Q��R�,L�sN;����7����C)i���Uq�,��>fE����.��(��q��mK�%X����(%A���=19��=�������*�Ew������J�C���x��^���R�X�\]^��f(�xk.�K��P�][�\��������HY�
����J�gl�9>��QY�@�R���g��`VSNV������,�(~����V��hE�`�(���O5��uh���a��.)���������QfC�&
�$���A��a��N7��~���u�*���c�o�3��}������Z����H�Lg;^�(H�.���9�|��%-!����
HKH|���q���2���������aaE��3&�A1r�������^���==9�O�&��-nA�p0��}���U�B�Bx�n%��EvJ�g�����+�/'���i�����F
Y._���`?���P�Nc�m[���V����?����_��<q�q$���[8Y����zV�������7�-�l�&�!��{����J�������R�I
$pz����t��� ~�JSg_�_m�����%����tX�y���~�����+&>9���#����p���4�X?�V�A�[�)��2���Aj�<����� q������O�,�2�e�.����'� S�J��ED��������5�b��pb�����*�c����Hm��&��iTO
��:1
�D�$����M@DOz����xT|cIM
N+$���������D>�$1|��)�:������3x������40a�_TP��O)>E��]���%`+�>�A�!
+{�i:q����P�s��Q��>�(N��|���S�k�����f���Y'��*��G�3?^	�q�d�d16������*mP�|��_�x�NaFQ����bbwo�s��f0^Ayffr]�R_��w��Ko�5���W��Bz�
9���/o8��[.�A��9pS�|�������B��4���HI�g	|�?��,�}}���\����'��[=�B*A�� ��BT��3Q�S,�Q��`��&{}��9�(G���U$�� 	(X�QRNk���������q�H�n��7���!or�NXVx�No��dl�jEU��(�(���.�K�+s�������?�dGiH���<� ���Y�n{-�1���X.����>?����W���A��D�Dp�~�:���N09���4r�`�46�t}"/m��2e�$T�hL&��qa_�(��L�bA���tC���.2�����F����M f����o��<j3k���*�\�9�+������(��k�V(-�dg��z��$�o�?��oSk��2�M��{I�`R^�������Ko46���Go���cc���)8�B��q5�;�}��9������)��"E=�u�c�L����>5FU�P� O�MoT3��.VTE��lZ�Kv0��@���D[��AM9XNjM�\��7/BX�'T���Lnkl����29R�K�(D-|���%^�����N��V��t�%�?op��$#�$7������Oo�Q%[r�����FD��.�f:�v~��\���~2\�����-O�YFIc�nn��4��|��~�xR
�l&.�*��<:U��1�U���f���h���n��������S���zgI�,��8o���	�<M��j	��Ft�*g��/��l]���0V�����?��%f,:(9c(e����L�\_�����lb�Db ����w��0b��R���?vs�%\(��
/~�4�O�"u�

�5R�A_�����j7���2^��
���6�]*���nI&�r�|/����F�[R����* ���K�8K���f������%Ho�Zh9T>p���f�����1�jr�)��"&�[��=�/7���B+�j��{t�R����)�l��"�=��:���������������1)\-Zr�.���^F4R�`R���l�����/V�K\TM�XW�u<�8�k�o;�U������,R�(��`c���Rm�AS�K����|l���nI�S�^���)tbwn'��2�li`�r��r�H	�����/'��^#b�pW�� 4]����G�����I���`����K�'��ACj33C���v.���a8/�������+f
�E�����J�5��a����V�R)��c��`M@����3����N�~8��]^���0�kX��v+v��Dp���]3��������K���������t�Zj�p���8/�ZrpG�c�(_&>\�j��QM1k�~��y���hq��+��:G����R"�s/b+:<���Y�)`l�
���m�SwC?B�p��l�B��M��s��:����]�d�
�
�[�#I��<HF����w`�����a1��F<��@��7�&7�����#��2�J��������7lT)Q��B}���kI�g�����Rs���k����YI��,K�U��*�Y��Z�Y�� S�e�9�pah����;]��s�$$���6��@�h���n�F��v���.����������l*I�A�L�=x��9��u{C1yB���*�C����B���%�=t�����<�No�}������F�����3���-������&w�������������f�RF������_���|����������������u���+����
���q��QL���;+� ���9b_�E�����}Z9M}rw)���V:�1���"=)�-9k���{��B�����Q���bzO�MdZc�����k��5>�jey�?3J������fDG����L�B}��=k;�|Y����p�p���T�����St�3NH�H���P�bsT�JP&�8�!I���u���,��$}�2YD��Efj���Ku�n���Y��>+�e�&P����������/�#�C6���=�"+������������W�H���7��e|c�����7D�(��5�p���2���6����z.Jo��`�/:�*s����E�I�T��z�����jl�8IR�\;�;�}]���'�p��y��V��=�iv�	�����t��u�!�3���$����UH�9M���(�H��o9�I�����3�_��N�����H���I�
d�)�Hy��5�.!73��}�g���Wv��'+�������n�A3���?wu6!)=Y>d�2�>��{'
J��UPV���3kt:@����������!o��0Z�=;���j�qn���n�U5��k���=��=��2�W.������z����B�Hf��P�?��i��t/L��E��)|'\�G7O�d�l���H�����s;��+��dF/(f���r,<<�e�������FW�3�b3:&u�e1v�_L�^`#�Q�<w��Q"�+���Q�
V�i���QXz�*X����7���u�QR�����?��~�0M�y��� ~��1���w�ws#�=� �A4b��*vua��::�&�S���s��c>��]^�T8g��s�wsd���	�����"���[�����b@6
�X#y�F|��=�~��.������S���Tv�x��.b��+�3���%#gS��Q���6~,���l���R����y(��De�����7���U��8
'��X������2���X?S�d��)�#�)�lV�
�W������5@��yZ	_�OG��[Am���'
���V%��O�"�C�{2e�a�����s8_���
�����g��mk���D�]W+x��s���N�f!�\]���]sz��oY�R���(*`	����zQ���b+��.6L�q�q��ok�?��tt�i���C&���g'{��}���1����P����3�F���u���F;�gq��K�d"��4��t��BT�.9����J2�6�j����������Ok�(a4��@���B�~*uz�x����,�He���[�*r ��e�pzC�YOCr��.�$�m�S�
���kKx�;1��^����
Q�7'����8�U0VW5��Y�����_|��d�N�����W�Y��9'�[=�v��[������(����M��1�M��VSWU@`\�_|�	�#�DN~0�#�(�������� �u�eB��e�O�aG!eO����:O}��/��=<`���
IZ]�f�"�s�s��W/P���RTK���O�����2=�v���$�>��}��`@5���%��
0����_p�l�M��9���&�x��%3��En�J����Ez�z�g�����
w]�t����x�%��Nw��U�%�Hn������X�aX����;�����w�,�uSX��"��Rb����7�7�Z���������7��w�68aj�'�M��/�3�53�5K������qCi�hk�������}��W�����RZ5���q_/��&���#�t-ZR'�A������f���OE�gb;�2���56�9u�Q�J8M�|s�J�>�@��\�Ct����!�A����^1�K1)�u@���������]�,��!�w�S���Z�������J��������/��2�v?������``r�e�y��$'.��#�V�������0�V
	>�3Wt�-�g�;���/��Z���
���2���MD��������.�2uf��[�/�m��Z�%�}���}s�f����b�A�3�
��Wo �����F�����RR���T�g!�!f�p���9�'�G��NX*���Y:����_$���r�������g��ki��AZZ�B<�U������������~R���&�jw�X�6���I��8+'Uw#��& �gx��,k8�-)�����o�
��w���79�����e���<W�\`BT������s+���y����;m�7�I���
3MK�E�(����N���x�--6M���icV��Sc�8�q$Wo�K8�tm�9��T�CO����*�A��(����,�8�:^
>�5 v$�+���)�����A�j��0Q��%dx��j�-{#������;�V�������?A�"n��)��T��4����+�:��c�����:n�}&��r#7�3�7L�1�M�0�i�����i��N�R�v�{M��,�
`?����?*�g�$�N ��C����1&�u�A��k����>�,�dXi���	�z�T\�<�iVT�@�q��?Lk7���G�����H�]�G��	�N��s�8�L�|�,���
�Z���NP�,�k2`m����0����{u�qLk�h
�+2(t���B9�a��u����gE���oB9�`�V�z7
��!H����	|\���N5����T3����u��v��v�gI�U��<�KVo�5�z��Z�7D����x���������g��M��T�I7z��l%3�������0���|zx_`y�&���������r����KC��NR���Q����\���h�e}Y�'�/�K�q���g�������|7�#�g�@���0Y���!P���iAj������)�u]�6/l?��N��q�����R{BU��N�m*��H��S����jqc���1�S#����\��8Ey�v��������)���g9�(�n������X���x�f5H���������1-R`����^���3���o�0������Hi�2:������p��s���Z&\fD���h.5UKp��7\�NZFv.��O�,�V�.�C6���a�"BU�b����a�5~C���;��SE\��:�VUZ�G�b�X�%�������	���x��'gTb��3:?\���6�c���L(���:t�V�s�~���&|P�v��|�vX��JFaHbNS�!���'�z�rY���MkZ\;m�R���S����QP`�@�h��/�7~s��Nek\�-�����v�$#�R�X^V8������z�N�I 3�Qsw,W����g�
�+he6�Q���D�%H�b�?9����q��|�W�n��,��{��,�vF��#�sC���7q���+�}P���U�<���1�3�Q������
�Q�KiMXo(���OT�����:��{���m,����n3��	���F3�k�,i��
��wy��f�-QB�5�e#�Mi��������b��a�����P��������	$�M�9��Z�F>#3�����hn��q�z
���`!E�`u-�>�)����X���[XD-�������$�=oA��dp�s�����_��XL������ZY��q�������A����bP}3'<.\�k{~3*Yl-�@d��23�Z���_~����B!47�x�SC�����Q1aJl��e'��(}�$pI�0��E�G��>�n*�!qk\�oH �wN�H�����������E/c
x~3J}��)�.Y�1����jo&�c<�WU��`�I��
����z/�n�d{���H��o������3�����UMwl8 ��B0g,<W[�s�E���m����)�z&>���8�O�W��&���
�k�o(���uOv�U��n��T����2�V��L���X�D��r])"��#������M�����Ws;*�~L����'��'����V���4��XL����5�R�d���Pl��R�b��_�&g��G�7���nK�^��S��tG�r@�����������sy�e��gN���E+�G��9��������1���~����5��6r�=lJ�����D���x�w0�����C�����3<\6gR�>oX^)+��R����?��Z��7�q�&�o����A2���H��,��~�j�W	U���q������|�� #��Q[��V�TZ*	��	�y�W���_����������d�v��_����/�����d�Q��#VW�cj�k��h{��K�:�nX�*�����W�i��[���#��[2��n�e[@�@j�jeT2��F7��g�\�9������ ��U��
�C�R(�A���VV������A��A~�U������[���	rc
v��pja8��Ir��nNE��s������!z/�i�4��R��q�f��4yJ��_���k��z�x,��!16�]�-$������oK4juJ^G[������u��D�p��.�Pm�`�{G�RJk�����u�������^���%$����A�DE�Lw�	6��0Y-�J~GH��'�s��������aI�����?3�l��z�^C�$������-�K!����]��r�v�,q��L�3M�n1������$
�J�w�\����r����,�6S
"�d	��H'�^���*&����g,���3��1��l�c�M>F7,�g�g��B{���U5��t
����vN����
�H���B�C�j)���go��WxvtwZ�A_#;XIjH7�{}�����z�"U�F4��9�,~U���r�'��[�����Fr�Mw����Z\����D=G�	�Q>�~6o?A����&���i;�\�n�.��M�W��1���Lu�A:��2�A�y�J�k��9����Y��%�
�����b�oH��oT���X�#�����Qi��7E�jI|�Cl^�q�n�����O���!J����6�g`}#������a%���I��FF(�jiB)��u��&��������PN����%��6��My�Yg�(>|���FN�i��������uY�:��Y�+|�������4f���v��/������I�����NO��	����I�����K��(���B�%W�W�[��Z��Y��P�������p/���M�_1�:����������Kh�y�U�������Z�i�F��x4b.�����"Qg������PG����"��_����P��E��)!�b�������V�X�Q�����Dy��oM�C���kj3@0n���s���^+�5v�/��q�����Mn���
��(uP��~��c�~p1:�r����"Y������P�����Zc��t�wWx��.���]��Y�g�]W�k�!�����*f!�w����nh*��71�jE�Q��m'�����i�g��>�����d��3.
�s��_�-P9#���""5/����m�.��|F��R����*��Mgo!���|�R������n{<'r��D�k��g"������f���������Zz�*K�TQ?�M�XE2#RD'�o���/i���Y��R/n��p�����
�D�:������?��a�(��*B�����o���g���+�jPb)�Y��l�"�1M^n��*���������JK����l�����k�Oe0���8�2�1C���) ��u����Up�v�-�C��wW]w�@�V�������W2�����!����n��`q�x���r����'�"'��jH���[6��??���%�.b�5�������W���Q���
�!�kB������E	����%������y<�H����K����}�"��k�i���
VL���q��[�"�l�����8A����?XG�P��#,w;,�0��?��@~-����7
A��i�?V��W,_�����M_qq�_x�-
�;x<�K�7�Ik ���� E���oy�Dg��>�z�
��z���&�%��u�G�i������8"����5��*���(-"��%���{�����0�4�'#B�����%3E9G���S���Z�xp9��U�K���T�B��V�7�N�]1FF
��/��4:��c��:�(��n����$��l�Q~9�c�N����^�A��4tr%�y�3���`�2z��H8V��z^d&��-ue�|JRju����DrA����J	�y+�%I3f��!�/�+���Y�yvUY	�,HG�sy��U�[�(z;.��� ����&)���<u�/y9��?u���X=S���Sw�XN�b��"*A�T���N�^�^8~*13aLh�1%3��A����K��qU0���9�g���b�
Ko���[VLD�
IR'4%�r5��8*:ctjf������<(��$�{������8`�A+P�Z�'������G~��	� c��0��[4D����^�a�D�/o8���v�Su�W������H��� e(������G����((5��*Y����c��B��&9��p�/osS��S�d���+�K��t��z��I��s�������d�xj��>E�����R��x���<U�H���J��fM�l�
������.��~$��y����LN�2=�������T�(e�2�{4�;>�|R��m�R!����u����q�)�@�3��LGC)���Yi�(�A�=_2tu�amH$��C��G��sbH�OKL���6B���U���MN�[<v�V���L�/�/�*��@�����$��z�Zp��]��a�;U�����6��'�C"���)�h&b���~.�������9_u}/��'���>�~n1�PI���7��[�Z
���r9?�*�k��$�fC��4�����(�2��vrm!�e�����k��0��5�N��|CO"�g��Q���=����v���;��Z��p�o���h�>q/C���+�w�D���K��W��/�������������8^t(uz�?��w�����V9�VMYB"���S�����`���|t[�1�����N_N�������8|�wo>nb���X��|��K��4i-��>(c�
�Wsi��n&�/6B�SM�3��O������xSUC��������!��DM.�qE��-;�vt4]y(���5���+���Q�'����o%�0�pT�J�m������6��9�@��l?�S�>_��r����v�^�m�wI��C	����S��i��DH�������5��d�Q(�#���|��e�r����E���4�����T#���n��>�Zj>���5,b���4��FP5��� ������tY�
P����p>Hw��������-��R����������a�����>N���q�?U����Z���dL\�
�<��o�^cS_F9M�8���=�C\Q�P]�EcCu��M/E.���]����h���������|l��F�Z��20������R�w���$hU�](����=hl�
 W[�%�Xv����
Xo@�b�3���-~�����l�
1#q�?_������:��3g�P�kI����*��5���A�@�W�*���-L��O��� �X5#F�;i�M#���17#+�K������
������r������%�k/��fUB��19����W�w�����X���U_K!�Tr�����B��Uj����$���W�5P�:����H���w������������������/�l:�����L�������[.������Hr&���^���������saB�x�9�
!��p�M�8�{��\���-N��*/��@��3Z��.q]���Q�O�:����>�S�8��!��1+���}t��AS�D�8���$U�*8?�{�����.p���-o�qh�&�cj�[��Tpf���P"��G��k�!��|�1`��kU$+�t�MRiQ��@~G���9�)]���3��$8K�[Qtv����^�$%�@����L����_���1�V#��������I�l_:A�h�y��.�'5��}B�TvE��q&�vd�L7�|�"5��(�Jz��=��f7JfR
&��L��������[P��gU>�J�%�"�R����h�).q`$���v�u|����8�`�@�QD��B��x�Or=^���Hw�7��t�B|�d����"LN�������5vp���C��C��M�.r��q�O8 ��M�����9�[t���#���ZA��T��_�xl��H��ad�ls�#�c�k}�=������������`���v��K��SRr��R}�����,G������];[�C�R������������<,f��_JR����N)�[��<7�6��f�:�w���U�4����l����u���2�7��;�����2%�j����������D9q�a/M�������1��j�:_�9SO�����k5z��5�a���M��w�tK���x�rjQe�\��%�~�s�0g�b���"m�E����;�)�
�D.�vM%�w�;_��1�P��	��"=�������+q8�����w���wX�x�rYm����bi�#9r;� �T].m���	j���`�_�k�������o��a�K���~�����v�D����������_��N�������
�Z����z���rvr��Kz�K���Mt.�l@���^M����kF������F�C�02�����!)U�0���'�7�)����f�l�XL0CD�y\��������Sl��`e�0���z����!Y�d���M���
;����Y���K�x�����n�[�n�M��S:e$M��5��H�����
c��L��J1s�5�.�����9�4�������G)v�&~7N�=����-E�*2���o����2�'%���sJ}*E�lx4���iI��k�~����?I-��L�Zp��n].a�w����N�O~Cf���!3�Im��R��'�K����24�_��J��]��c����s!�RM�������NW��O��RX*��%�w�������m����B��}<&�lF���ZY7��T�W]��BJ�����)gL������7�l����`�����&�S�3�!
��F1�i���a:�6�k������x<U�da�{�>�[��9n�SU�e�t'��R�jF#[z����Yz8������5�<�,��|�0���������]�4�f��I���h��Rm0�a!�f���ax�����b#�nt��TOAq�������:�P�
����2����K���y���(`�&?<;gp�^��f��Q4�����
�8����B���6�*�����'/�C��O���Z�e8i��U��i�����T�ax����������f
>���|�����{���#-�Bf�A���iV^;���A���6T�S��p�;,W��}���GPs�������~~��3%����@����F*\�4�-:��A��sYq8K
n�bMDQ���o�/��P�l@�$��2#$������{�&w�62�v�~����M9U��1��k����~�0~�n{1�R���Q~p�����`kc���V;��ah��*O�B���Z��|V�Zs�����K�S6RU��j8�P��')��u�����
��s�`]3���x|U�5� ?�F\��)/�:�!2V�a�o���8lQ�X`���|9dm�{
����h��*0Pn4�j?��8�}%�3&3 PT$bt]���{&�w+�
1MDq����0tc��0*`i������v��e�g]�h����N=�T���GY���`��|��F0�y�u2�D�&-��sr�Vm����e�#�����7����P%K��W�A7I��i
O��3#��;b�������A5���5���V��5�2���������W���x�y��3���S{�
��FPU#�%/����
�0=~	���pk�2��=�,�_y�������[���d�P@-j-xo������y����Q��������f�fWc2�����/jv����>��y~~�%��\A��"�*5��s/T%1������2v%�����&%�WF��G�G0'�[�J���	4�\�����+��\�L�����X����z��,A���d����N.l���o*����Y��=���V�����
%r	���[�=�8�Uz
3�O���^�-�����[��0����M�)��%����}��g�1S���.��o���r7�����7�sQk]�-{�����=����x�
���-{Fcz�
^N���`�����������
qg��Ju���9� ��`K�=���2*�Q��*���	7_(���F3��k���=����t/���/|�2W�v���\�l��e0����#(V6!>�}W�4I.O-�3 ���[�E_�7����e�v��{"AJ�*���?��~���_
��]YrO26+�����9����(��������L��>r�C�F��>�;��wi��������xw/Wn@v3`z)�j�-����9~��[S6~�*&�\z.��o�%��+�73g����U��5�_�-�U�85�,���5���5���(L���]��'��&'��\7-XE$�tU	8���@s�t��'�Z�����,]H�~#�6���9������J���;�����x��"�����u>�����V.�$H����X�]���{�y3�~�|,��6�Y��=DH��o��������<6)��h�+
+{�t��u���T2/h!>�)C�����)�}���HQ��~Fp��T��U����9	"
�=�/����*N����^{��j��H�������Wu�/W�&��>���{�E�O�xS,�0l�7z��gaP�T�4���i]��t/%cG���Q���,!nO��w_9��j������fO��2m���I��\�D��;b���$�X�*� u-��s�{*���/�f�>�@P���
z8��39=�IXV��gm���&�p0�X�$�����+����~
e������^���B�B5n���jt���:4x����]<��[����jhMQU� 9,���UU�FA���<���-��s
c��V����&�'���n�8kC�f�r�Z�"z����{�h)S�Y��G"@[,��Kb��^��k6�G��/�����Q� ������~�e�:���������6���S��b2#��;68��t�V���m�WN��s���@�">���\������9�eak(��/F��Qn�����2���@��@)����
g~��63����$�Z�[Nn��Qr��?��'�P�/��d`5�96����a� �A�"s/���-��<�{��J�@�D�cO��!����0�u-�f��Q�3�����
��SYg,y��E�I3����i�����c����
��1�����S7�6���K
fa���0
���n���F-l�����l�����`ur����G�[����5�+�I�j���N��fe�g�L C��'$b���_��Ww_�u�PpjYXZ8Y.�c:��-L�P����z�)��;B9�PR����_�P$F�&3��O�C�&���:K����miU�z����C�oR|��9(�r�d��M�
3��2��;��a�2x����U������H_�/��!ij~����g���M���Fv%�������o��ugv�c�*���C����J�q�C�E)?��S7���[���i�v������������dV�N~i�*#�z�K��,U^!�g>��~��fZ[{���-��8�|54�PR�3�V�D�=����KOe	���%mM�Kg��3E��5�y@�x��������Qv1!V?-��E����>WC���S$���Gs_s�6��j�1�(��b�kd����CN�Q��]J]rI%1���w�-<��P�g�!�
u8���
+���~I��������d����&T���w��AWP�2�X�}���f����;�jK�������>�'��~-�w�9?�#���t�{�>������#��9�JY�-)����.������wo� k�Mt�r,2��)���?�P���+��&��S��fj����AuSR��h��u�����*b�*3�J��4H��I�@�/�����|K��g\��n������kF��G��L�B�0�N�1LF������Yq��u1J�n�����M4�0���	
���T��+)&��UU��h�1���^�w��bBiVo��������I���_�����4����P.���5���8;?�v-Tuf����"l~{Y%
?>qo���	���=	0��
��//�p&�g����x�g��Z1�YR�u
�P<�����'�<��[��H�47���Yn�������
��}�kP&������K�n�
U����q�������q��8}}j$�m���������z�	�l����GTGo�oz&.0Up��uv"5�����i�#[���u��m��AR�1_��e�����Y>�}+����� )T�@�sZ��c<b�\�����l�ei��icGr�Y������x��l�
��$K���/��6DKJ�:R���!S�(8M�O.������:Z��0���N����+�1�!�tR��UO���o4�_�8�!g`����������n�t4�+^��y�Q�����=��'8���1��6�h�~t��^��i(5;�iw�qLz�� �j�S_r���3�"7��E�]���F�o���w�<���h��]�H�������%\5Z�Z�g��Q<x?C:���A�%��X����5�f�p��CY��Lt~���}J��s?XR��6�h�U��^���'�'D��:����_���g������j��������,�7R�QDp�)"�������V�&�0H����P����i�R�y��?�R�<F�mI8��O5oG�.'"�\�u��������+a��b�y�C�m��iH�Z��HK��y'(?�������_���X������U����-7C���r�>Ez�4�����C}]�zSy��n�����4���H��g�y��V��#����V_�e���D3K���^o�!�c�?�
 ����<��"R�V�gkFs<��UBZ_~IX����=F����������:�'�3F[�9�a���D�Z:�Q�����+���u���I,q<�����lg�r�~pW�7���-�7m�^bAR�U��e!��Fu���MT%>#�s�,6���+�k? �9>����[9T���s���kt��e&j�����njoC��3R+�Q��[*���������!�W�MM��;�x���H^�qNB�{*�E�������{�:z��0e
����o�q��I4�z���E�r;u;^������$}(�qI���y�����[(�p����	���X�������(�J��@���C����O?>���"���&��-5��,�!-��:��+K���K��AP��x5/&����2��>��������}>Fy
��x&D ���{�n�lU�c��}����/G~T�c�m����e�UiWwKw�Z�f�:���cg7�����w�(�����8u�i�k�hD�)./)c
�N��x!��d<��Q�
P����:	���H�k�eHI����{��;o�[D��'�C��~+�%���t�4Yf3����c��t<XR�_����O���a�
���d>�7}��^hl=��|)�;mo����u;�����0GqbX����#��Y�tU{pM�k��k��,[�Z�"-��
[Gj�Y����HM	#�WBFD+KJ��8��JF�����+�+�M���t�F�|L@oC��{�H�}1}?I��,�s��&Y���r
n������xH��
�\L��X�sf�K6�`�3CC�6�Z�y�I�U�T��O�=q�U0��~���a�k���@��(a#+�BQge�1��ih�[��X��u�n�%I�[��!�����s(�6������Km[����p���J��p�u��.���^C�����UKP�*�d��=9OX��z�y
�u�%z`;����R��!Xr*��N@��/�n,�|R'��X��a%U��E�w�c�>��R�s���N��T�J��,I��~�T�L?u��@�S�y�h���-�A%�{�[��K����T�ha�����&<��3����7�1�1RK�_P)��n����)���Y�O��^��z��$v�&H���7���4���u�^��Sk��N�c{�rj�i�R���m�8�5`�7�.�`���U��c��%����|q�w3����
|�4l�LM��U���n��Q����A1��j��T���E:w��g�p!=�<�zw�G��#�SM��c:����mb=w�wy+��t7WnG�s�������1$�����M*�����D�b�q�x��6��v���
\��Bpg�Q!	6e 6�^���[�����^n�t�� ��4������duE�g��z*�����hB��dw�p��\�;_����]���&i��Ag������|���D�0Mrrj�z������l��_S�VD>��t�Yo��,�a3�������������7w:?wyxe�3oXa������iX�[�����������X����@��B��fw��h�u������d�wcM�W
(~lP1]�np����$.;�qY�,'`��A�����?�������3�Z\�����
v�Y�����"a���.W����l������x�v����UB�����9�j��&d���,�i�a����LX
��9�<�G:�mXQr��|3�t���4I*�q�&
��o�5��{�!�Jy^���i,�
�8�~�\��Rr���Tb����S�5��r/�p�z��_*��M��oa���x!Xp��d�)���r�C-f���9WU��V�r�\��hA%6�%&X���['O����i;�N�S=_s�����z�����?����~vU�nKZ$�����Nx�����4bg���
2%�7�����U����7��A%���6���=zE�4
N/�����������|�|D��}r����tH�
��n �
�"��6rK
�U��tR������wJS"�
�����Z��dX����X~������j�+�[�{��v�.��S��=��-��q�p>���C~Y�������0�v]�������x$YG����UT'jsp��<US#�/?�Z��U�������Sx��b}�f�!�������Z�-��T+(]7�1�����M^�x_R����8/���;������`U�#�����5��F���Y�!�A_�|\���E��*�@0�0�>^�*������7�Bj�^m��}��g^��#\���zN���?U�,����`��8�����
�����x��V��A,?�Zf�t�5���=�3����TM�/��0�,;�>��1���2�ZV~!�*�v��z����L��P��	����o[�}1������N^������s��z�C��J������y����(	���>d�=���`��o�>	\�d��Y����P���d�-Rk�����'h�7w:g���'�z�V|U�ky����`�GK���U����UFD��{Iv-R�2��k�����/9$d�/'���7s7�W���^�nRw�b;n��4����b,�j�(�f��n��h�'������.�������6k���g�4��r�e���^�Kd-�~��VCM�+|����h��6����I��q��3���f������z2.6�&Y�H~d��;�M�XQ]�����CT�����q�O�!�Q6/�>�O��,���vBJ������KR,��7�ogV�G.��o��|��| E��RjK<����{������:w&BFr�������,�)2�~�Ry�������)c�Wikz������|��t"�}��(�m��r7#��#8��LG���x���2c>����_@S�D�Q>�O�km���ZP<J�XnI�?YS*��w���4����?�lc���:&�4�&����e\Bbb�H3MQ�|�_�����$V��d��~d���]�[�����j_������s�h�x�Z�^�2���$��/����r�4��Vt�P��Wo����.nO�CMZc�9��}���xN�\=�"%����>s��AnC�F4�*�H���=PV1\bU�Ba���3�z���qe4���5�D��5Q�G.\����E���)��+��Nw��������)\�D���,��h�q�H{/�iu���?�6Lmu�7��xDNqN��C�X��j�dw�x��%���h�3=����*n���1��qYuBB���{��zK���#B�2��������Ed�wj4��T$IMM[6*!�b����E�� yS
�N��hTN�cH���xy���>OPLgK��x+�Z�(���_��e�8�yf5�i�mi��[�y���8;J<^���C�:���#�:�1M�2�X�9w���1 �X6���3L[���o���;7"?���J2�Ir*
���mI����b%A;�#����9���^:�h���i�D�G�f��3p��I!2�\�o���NN�
�#x��Zo��a]��Hi~�:,9w���W@��.���k�nHe�4~~�?J�t
���q�5��R���5G���Ggb� �DO�d!o�� ���1d����6�d�N��#�����bz	�"yld(� ����@�A�B�bDWY�&z���Z�7s��P����t*��"���{/���`������R����|\�0��%%�g�*t�1��"�����"D��a�~\�f�k��q6�����%4��w�|�������3��*�`
�9E�G��n��t;�F��g�*�~S���GT��
MT�|E�-����h��y5Xb��s������M��6c^-.�-!ot7�6��rY�<��� 31
������)'��e���9���E�m����F��0tS&����-��-����H������@�T�����Wl)fcUJ�v��� ;���;\��9d9(t�Mu���y�S�>b���Z��Z�:�wkd5QN,t>mE�����]��T�ch����%�KC����A7�����#w�=
���]U��!����LW��f�����)WH�`]��J���gBM��VHX�����|S���R�vR���}Y�?��������HBa�fIz�O�4a�`�TO�G���0I�����V\��f��f�C����������|��|A�o��/�n��M��A� �
�u,���$I��%�r���(<��)"��c}S��s�&�{�)c��W0��U�fgWs�N;��cW���C~�{:n%S_B�����G��i�e����7���<���o(��LXi�<�+r�F�k��������B�P��'��6�����_��p"
���<��M���-/������l0�������?x�q:RJ��
�J$��I��s*����$-X�KT�{[��/���1?X���e����Q�dD�,�����j����q���0^�c��[(��F_�Eyp�e�}�="
QT!�Q��G>���+^������|��d�m��Z���O��;���X�g��>��b���
��L����
`�}�QV����A�'/���E;�$���TC��-��in�����fBiTq]P����b��_��.�I�[�K���;��^�]�;�E�����N���{Y2�)�n������(�������XRU��S%�.Ci���=Q��4��\�(�����| ���-ii�T��H��{����{>�}�"���ua��2����������!
w������3��T�Bk���Z��yd�%]\h�/�%�&g���qI�E�{p�,�0��I����������4JY5��	 ���fK���O�UIm�����4��N��HM����{����Q/����H_l`Z����[zYT������zG���L���7�+y'
7��;�HPS���>���6��s�Z_w�f��n��$�\�������E#p[CI{_��3�n*j�G�Q/Z���mF#u�>+{���K�9��n'{������"��@����i�+K�4�||��#�
?��������k������sY����=��cjW��Bgf��}���B�m)I��<s�U������+-xB5����S�M��rw�`���#U7B��u8.���/u�}�h��3Uv����B�� 8?eOV��E�W�%��N�M��@w%@��4�U%��{z���-��h�!�|�^�E 9�d��n�h���X�����g�p���������6p�����c������fE*���XR��C���W*�����0b�����&b��"1z3��J������x��Zw?�pu�M��W
���������j�a�������|.x��3���z��
j��h�{���g5Y��>�������e�w������{	���	���s�P!��j#Hw�7�:�9@�F���G�d	�J60v�"����G-@��oP�?2�.&|d�'p=�����==9X�0y����������3������i���W����wYX
����. o"fFl��.��'�,�A���I���J��fn+�}�����h�	NQ��_0���L��8����Vt�A,���*�"��F��()��3���������I�Z'��Y39����p���m�0=d2����v/��M{�y�����8"�s�����\n�����x�
�:���r�3_I����T����bV�kGt�}����
� Z=�9� �aM�����H@m($kBc�;�.�a&Ee\������z�iy�b�}�^��O{F0�f�
�������]�hgQ3����/P+w�VZ��1J�%�Wj��,}SC.�,'��`BU���?j��_���^%����o�+JP�*y-W]9z��uT�F��Uo��/ B7Y��N� �#.�i�
�h}�d',�Eu�k���r����Z��%�>C�;m��>*/H%f�����\�����V���m��Q�������QH���#�[���}�+�I}V��+�4�H�}�a����c����'�+��Z!�v����4/`W`D���p���5�d 
�"��8Agz��?��<�A!s�/&-/���jN0��r�[��P�������o��?K�U�G��"���� j���=x�dK��%o���SN3o�oK}K���D2����?����-��5]���FM�~v�A�Cw���^?�w{ �-F��f$7���kx#A!^�J`���F���d��Ku;k�HS�c�O���0�.����]%3@_-7�d+�A���`�y`���Lo�w�+�lW�A��eUFa�0�Bl��n0���TS������r�����aPa�\���_��F;X8���E��P}D���m�������U5V�\f�xT���
/�O�
!��Ew6��m��{�*�I��&O
O8��vc�����-����P��@�=��*A�[L%�B������f��f�P=6.�"���`x��?�M�T������O�(v��p�����H��T�j[������IG(�~���rC�|�w��:��n��%�a��Y��P������h1�y/���v�mL\��iAM��5�k���������[���]�f��&c�*���	��{(TN\]�F�|  Q������;�gjo��fR$ 
��k���
�K��J)�G���Z9t����K/g�JGB�Vu�`���X���
�3(�����>����d7���k�W]*�~��a��o�;��~�i����AN�~�_Ab~�;�k�w���������<w�j�
���uAw�2�P��6-9�Nz,�7����@
1Cp����2�O���S���J��/^L����7������^|�$#|�
|����:"tx��Pum�~�f)�!9�����-\S���������hW���E�>�����M��4�Y
8uvW�[�_��^�}�,E��b�����G��fX��2��n�����YM�����������9OF��e^k|��"�H�'�H/�Pn��^�P��j�,�e�Bcb������r����$����+DB7�#�#ji���;`7D������rG/6I�v��C� Q��/V�2jO�1�avQ�6(�3������4��M���EjA�����t���g#�f��ZGsV�R����R����d05C���P��x��C���BE��!0����6��Y�h�iZq�h�����nFn��>�K���u��^���wDal���;���T��o�q5�+�d���	[`���$i�GL'��i��#�HZL�n�8:�����$v�eR2N����fG�,(P��k��++]�C{�T�����E�h2�i*A���~�9�l��+���������:Xo���3�@����z�F�?�����+1T�9����~IGW3�K��K7�*���UpG�m�����b&L�zG��#���C��=�v����"i����m��-�I�f�@�QC�?��!�����&�����%1�9*4�����'�����@�����[�����	�����n��x(�R�����U��RU�q�{�u8i�9�0�X@���
�ax��?\Hj��%wi�^��^r���i��*MC/�w<��R�F7;=�� ������^n�L��?�����IY����%���������9C?�g��Q���
�u�+pR.8PB	K�Rm�1��h���Uy����c
Uh �c������Y�@�bx*3��8�]Qf��t�	���.���W+��5����8;M�������7r��/w�	#hfU�b2�.�v��Z%J����������z��?"�����_��|'���L����3S��| �cJY����Z�v��E�M�R"�����E���cyXJ�^�R$�
��gc����p�;I�0n�$`�I�u2�c�>'��+g��n6�f0��G~0���]��_m���t�~M�0K�"�t�uZ��J��LP�A�H��ti���1�j�m��JFUq"WH��?��r/����[F��<��^�o���F56/%�r���E'��7����a$�g�Y$f�����
0`Yy�a��,���P~�D�>�_���^%�$�.����������uO��������x��\x0r����Tco�/kW�h[	eGYRT��U�� ���~bN8��7--w���Bz����l9%#��A�g��A$���J?�p!y���"
�t�(��$�������R�Z5������G��8v��� L�DW�����R4�n�u2URW��?e�<��:�t'�t&c�P�Bk����+f�V0��lW��R����N{�0������DTF|�����B����U�"Y��o���^�.G��o��+��`�H��u����>:���r�������#?@iPr
�)nfDx�r!����DA�l�qU��f^����(I�{����;��Y5?�>9Q;��e{S���;FI�j�������h� �ns���HFS�0y�U5?Z�!RK-v����U3@wK��56�W?� E��\trk���8���5(�>j���J�|��0��QI����r������hvj�Y)����F"�#�0�)�!�[�W�����"�l��=��PTVC�E��'-����
6��:t����X*�"���-���,��%X&X�:
�a��!��o�sc{~����l�P��\-����q�r�]�'�/�
���
���0a�_@i��u$�B�u#��[1@[\M��Et=#��{{���Tr9a�<�<�4I�a��'_��sZ?!��������[.��d�����,����L��i�j%o�b1E�f���N��5�p��T���
k��
��et8���z�2����8�H�i�9���
6�������m�C9���RE��|�G�x�cL���9��S5���������B�5$�}���a��2�����?f��;��\���l�kSV�c��n��6�Y(z������X%9r�ii���Qt��z*�������PA�����}��g����%J����&{�b��!z���(�0�&�L;t���9��B���3{�����W�9{��B�fmF4���}I�T�{i$�5�F����8�����B08X�q��A����^����o�X���F�����>=��X���(_XH�d��"}�x�KP���d�V��!����^@W�q���(�:!
��9!��Y����t����2��!�����Ta��Z�v�\�0z��l�'q5���f������%���V�H(1�����Q��`�=�-lb�!!������#eaf�i:���ry���2�k����d\��D�T�3���k��$��*W����O��W����nfS��2O{BN"���lXI��q�A2����`|�G�%�J�a�(�f���^�,��0�Q��X�r�<R�/��*�AQ�'�����Z.����mP���5��011��>����	�!�M�{�}�2H!�@�7�(�6~
��W}g�����D�������G�{�����S��1Q�����|�^��e~�Q�;�me+��S��_P�������!<�������
������{	"����PW���G�~0�\�+����X�G�����2
��oG>�/�N�Ce�BK\���<�AK��o�sC5D��dO����"�������nKBt�K)���P�}y���
�����&m�u�c���lgA�?:S?������e��G��[e]B{P/�����F5_x��pvN����=-����PI�_���ril+��:0"F��F�z�}3�����lFk���;���G���3�������� yJ�0n1���j���o��-j�3�N�Q�UVa�6tw��O�7�H���X���\Y��ox�M�h+���>�r:P���'w��c���$3����[�w�/Z����:B1l�Qj~��7��pC��y��J�/��RUm��,[�&�}�Q�1}���T������c8�@�glaC�*����~	_Z~��E��9.1�}uG������9��]����`V]��-78���r%�i�i0O�\���x�SIN���� W����G�������k�\���>JS�X�v9V������u�(��]2�����H�*��,(��U38��e���>����J��n��#G��h�K�M��j���������8�^�?�hH�<��������l��\W���������cc�w�1UWi�?��~>��PG�W������1��Q}|�.�m5�U�?��N��L���.F���b��j�~���LO����L��A�ohO4q#����*Y��a
�t������q�j�)i�H���^s�0LR�xa�Unh��y�h'��Z%J����*=�
�e�������=)8b� $+������2^�:p8��/����n���dk�	���?�6�bM�@O���TU���*����W7�#.'��*�@M�@����b{v �6,���f]>���u����p�9��� �o�)���H�u�%.M�JS�2�W_��
o���KG4K��Hr6��.�8ZM��M#���4�����T�w�`����
~��3e�L��F|X�T.�NN�C(d����Xa����������o`*�i������}��!W�v���������U��jS���#l�F� ��<(A�v��^D�%T������RO������l�X��3�fJ����]�t�oq��q�"�.��tX���4[X}�7������P��a�����S�j~c��&�-%�b�
o��z&�!Mk��(�`�3�w'�r�,��P6���N
s���U`��s^M��-�d+�Pw�K���G��h�6R���������5G�%W�� Ayh�'bR����g��$���f�_L��~������5HJ���!�������Tq��E9��OsG�g6�����:��]t��XN�����N"
0/��a_��&9d�A>������B�r��K��r����F3;���dM��a5j����K����SW����Chh������%g��,JA�������
���F��KaN��?@B!�f�;��=��6R6���u%^��R�����N&���ui$��D`R���T������������������*��'�����d��T����$��-L�����0��jq��&B��Uj�l�)[{���M'6c���R�����)
������wt<�t��r��6N���.�Wr�m�����J�e��q��$�b�	�M=�[��P��M���w?i�oN�e�X8�������w�������-��}�#EB��#��[����o������������}����xy��Rl>V��<����w�}A���O�oR��
:��y�N�X�]H�e�������>�u���e)��1��PwF���+jt�m����;��C1��:����%i�Zv����Rh�/a[���
k+���h���N��_=��J	k�V�aD���������'�k�9��.���@m4�{�H�a����F� -I���h��OI;gwb5����6����������nXi�9�{���r�-�lKK��-]�}����D��z0�!
���.�}�������$Rj��a�L����ib�_���"�����Z�%O[]r3�/8@�~'�b��6Rh��kf��lO#w���V�nk��"��&��k��3�WY��=�_7�:���C���#U�t��~7�ox0\��)'���;b��y-^��,��t�P�!��X�;_����x���5k>]j��-��^�j]o7�_�A[��vGg	��S�9�M�\_U*C'��*�!�K�zMx�tJt�]Cz��Wk:e�m����/�
�;�"���g����i�+`tx"��{'�J-R�~+!y��V0E��{$
I
�8�%,
Ap^o+��p�a+�>����I;�e��h��G$����/��d�l�buE�'�q��!+_C���}x����It��HhOfU'R������upq�
�O]u�������	�0
�S�)��M���t���1���r��V�����������P^Bu�G$=@& �P��M����=�������?M_��jM1Y�bn�>'���z-��52ph�x+�_�JW��~�s�(������J���2�OL��3�����������=����12���,m����;���)=�,�g���t�.�#�����Z�������3M�M��p�������#�F�t=��d�]�����y0��2U�Jy��;n�����i���8�����g���:�� �f��^����u�0,k�P�.�j$�U-Pw�����9]�b�J�q��W��.�Bc����U�����(FSw�2��|MC��W�>�Z�eadg��o��0\C,5�3�d 6��s�����j���/��*�L����&Haql����P���k�m:�;��~���&sl����Va��R4�Ti"!A��Z~��������q�F4��d�}3����iH(a�T���������&����:���4ut]�����#���fWG��h�i�n$+������(�����8;�����z��X��b�����dC��uv�o��5�-`�*uOX��D$����X�Y�i���
�������z�����j��Bdd�����cr�G7���\Q�P�q�������fr��TPdQ4(����wNe� �����3%b�j)�x?#�L��Q��A����}�T��bd9�BrZ�e�6H�S�4������+(@��2��TO�����H8��gr^�A���H�2+&6��-�U�!U�N���Q����c�0���x�IU!�������h��:	��� y������
��R�}�����D��w�0��+a�����9]�4����p�#,`y
[�9�	����.����}z0V�W�e��'��������rl	�\�)i��-����~�K��u�k�x[��AID��&�����b"$m����5l�m��S��C�yl��uQ�������i��C&����t��W��m*"�1�c�����2�����Z���C�`�0l�~�����A�#�/�.������3���>���k2��c[�^��������"����Ca�.oG:�������V��vG��_U>�t�^�����P�����e��^0�Z�8S6��������>��B�w�VR27Y�d8���7�����PR-]���Y���]�o�s���w���I�^.�o1�.�&����^9:�����0�u�h`J`�����pKi_SE�_s|O��s^��ZM^��z��S�U����?Y��f����p�����c	���c(����Dj���oU�te���1W
�vc�;�O�b�"U�U�@	��.R��~F$�R�������*��q��;�8��zb�U�gXtV��n�gc!=��TA�(BM��O�zOu20�S��<�f]���*�bx-�;~������n�uF�E������!g+g�5�]�(j���
��3�2��`�>]��
*�mh���=������!m�.|����w�j{I/�_xC��7�Z:�9���JqEy���7y�!(�aF���������s�gI��U�JG
k Kg�Q|���\��r�Fp�X�/��+�c��D�[��B1Q�7����S���k�u�28sww�??�/�������XJ�I�5v���Tn�oD(d�����&�7M�#���@�������m6�����'V0il��4��<��=K��)�1���T(I�C����(m���i ��^(D�z��9v�T�#������<D�2�����A6�}��-�F�
F4�� ������������
�-�@���K�k�����;Q�7I~$f�b�5��������f�RK_Ng�C��w�&��������#
����/��.4FO��k���P���r?���0naC
j����3�Xw)o��<�^S�Z�L_������
c��|�0~���/�<xu���om
gX�2�M%g��w�Y��4��D��K+�)�}�T�UyR�_�W�<�7�N���gY�8�H����i"N�
)����@|�q�������&-���u����#G!�x:*��.���
+@u2�n�`r3�
�M@4v��od���4�����Y-v|������*�L�����3��8��A��-���j�����z��U��G������TI�e�s5����{����w��f��X����K�����F��� ��h�����I@3�p�����w7u
��E������2j���xPe�R)^����	h������t����Zf�?/�Y�A�j����!��h����ZW9V�"�{r��v"�X>������[���D2�������n��p��g5�0^�&A0���m��wFpF�������u�0��Y��������
V{��m0���v��
����+�E��Ib�:�O+c�3$\���p;F��K��?�����)�*?���%�*���*��3�Y[�G�
QuM5I�����������4E����U9�7]('ZA�k�PsV�S������Md,���C��G��Z�
����@Z���l���hwQ� ���������5I"�,�D�2���Zn�� ��Acu]�9�^[������dl���8���k����SfT�Wb~�\f���i�������������^�5KH�q���r���8�;e��>j���@�`3_U���-^�-~7i��-uT��9B~�KK��������Z2K(g���BV�}'�h����Sc��=`�B���2�#�8��"���}�o��~��4�nT}�����]���BlC?���.=)��O}���]�����i)	9.���uH��������l�������~�9�P5D����������A�V��K�e,�K|�	��4���)E

���d%�_��Bd^���Ss��l*B$��i�\��~l�KuP/^��������%EP�f��4�+��Joz���`\���6r�;QO���7�������l�pf��|�8���(��a�X���M�Sh7g��-Ki�q�{�^,��=�������	�gIj��J�Rs���/���q D:^�N�[;J�b�����-�u"a�9g�/��*����/��4���������a���-
fo:���:o�1�U��5�BD��
%u:g!�)S(��^�l<�%�u_h|2��w������1L�$.8�O(�a���4��B���K?��	G���s0�!�A��BLe�l���GWn�1��6�u�d�MG$5�RvtDp�x=�r4�jT����,*���a�����by-���$<�x^����%�DC�� ���js��o�|
�cfr�C�����NL���w�7�22@?O��Iwb��z���W>�%�#_Yj*q����:�&���g5���#j����������d��AWz||���I��a	3��	�6de�V7~rp�S�vcv
16)#�����l!��hgR�J�H�{�����4�L��5����E��5[����`_n�X��gz+��6X6�=�,��M�W��o`ly�^�ri���!i�9e:���X��x����Wd��!������0"�AABH����������L#�03����D$n��l=k)��qJ���0�}��������'}	�G����>�r�7�i*����h�m9[�{���M�� ���� �������]�1(�;�]��H����O2��%`.��a��:/�m[/5��9�f�=T�SB���le�riI1ac0��#U��o"6��Tdqly�����G��a��|����?�H��!��W�\u�4��+�s�(�8#�1iP,c�I�ST�����G=*�����Om����+����)q�b,���o�j��}��x�Fg�^���L�oj���&�D>�����X�e�Lh^��;�-�]�6�Bid`:�m�_��ZHfG�4L1B=H������n�w#��2,�ge��
d��T�W���GMV��������z=$�����9���Z��R����W\���1�lMUYtT������ G�h�Pm�A��!���:�}������n�ZM%��3����_��$�Jn��SeU��@����Y%)�'e�a�O�%5��c��SV� ���O}J�������D`���1�"�t�C���i���r�&Ys���C��sB6,?x��t���t������'�F�)�;���"P����� =�N��k%�zN{����q�[���w�4�&�H�C�� @;������B:�a�`is3���7i��Y���~��1�Pyo'����I��F���k�6^='�r����j��5?<!�	x��H��?�
;��'7�������
6>�J�-�z�N$�
�qK�[^�)B�Wk9Qj'�%{�;��Z���f��[�c��=6�|�G���J��A��hl���WG�	Z_BQr������a�"�8U&0k���	��Ck�s��go�j�����8��[f�w�����S{��[0��L�!�]���S�l�T��
f�<���cM�Lu�)w����P�
��9�������\���-�����n������G,&�>Q^:'0���rvC��_���|��j�l��5'',X�����K\���L�VA�@�%��%>0
����g�m������d��]%�8W�-���rw`��o�XQ�qf���S[���C�B��~�XZ,hJ\$��o��z~��uj[�
4�����������o6r%E���pG��f���hK��&M�V��B�O]���J�Mjp	`�>K@��0P��7��y���*����"�|z�9/l�\W���$<(m����/.�^�}8�0����zf-�GH�#�2uz5}������y�����)u���[[����ph��2�G)���!"vw���B�~6uF6���H#���#^{7���D7��5g��B
�S��$O;H�|)�}���@�S�����HsSq��$������5���5�dT�l�9~hs��M�!���QQc�7+6�M)'�U�6��������>�6�#�z������I���=k���U�:�s�DD�m����3�����CC����&����M��k,Mp�%��Gm~�J��D���0^)P�4���#����5�}UM�����x�HF8�����7#v���������I�V����7��jV��	�IE���;
{�1���������jJ��~�	|$�����:�$���+���!�tR��J�s5~^#������� ���uDnj����v;�kb��e>/�+���yU`x/h�IHz���������{��(��C(f8��,����Y�=]���Q�GQb��Jgs���_�#�W4i�����������H,�V�a�����f�J��[}���<kQ=��0����=�'i/l�d	���P>
�p#"-PO���v�e���U�����^nU6�&��0<���>�T9k���3b*K�CO�[��.����c����/�%�<(w��7��QB�;E�)&��a�����]�=�J+y����{52��	�.SP� �w�/�5�e�#�0�i:�@�GGw�n,iY~�L��11Q���5Z�����(#�:�p�u�{&�v���n}�Qy�'5\\�J{��Mh��C�	�f���$nqa�n&���y�
��?��������.���}��e��j]vI.v�43-_wB������d�
���^����I���UXg
S��:�WB�D����6R�9����^-�r��6�'^C�E�����'��-o�:�������;���n�U�?���6)���<��lTTao�O�Uc�<���'s�`5#SCN�BKk��u�S�`U���z���Z�	n��h���j0�?�*��\�4v���i����xS�oj�����~u[�~t��c�`�1L}[�b���$RlA��rr�7���
���-���I�Ku�����`^�a�����Fy���`^|S�p�#2�>���_wH�:�-��4Ts9d���Zr�
�G�{�����T|����4�������������2�|
v�o;��F	�r�p�������v�q���������`�}�t���PUC���nW�c(?�G���R/x
]����5�
�t7)�K�Q�z��sv��U"��VF��j�W�P��fa��9���^(�[A5���T����F2"q����+�����7�$
�&��<muVI���d�,(��u#���aJ#q�i/�V�,�����4N��4��1�Z1;d$_��)�J`���H��^q�>4�xZU�k]�~�3W�,:h�;
I�s��jk�<|���������0>)m���y�;'�������u��D�>F�}V�� ��l|�i� ��I^��m������:�z��3���������'�%lD�'f���}�U60S�s=.�,�k��<�}m����U�!4���z�[��\I��}^/Y�
���>�����-�,�2i�������m����G��}�T�����><��~�����@������f�������`��
d�%V'��������������R��bo�n���7{��$&� ����6��l���Z�~j�&w[e?oe~�v�aI���x��Z`�`�����D���
�o���sE��
X����>��^%>\���"�TI�u���p�S���_�8a������4�ux�]��z��r`@��n{����=�yl�Ke��G�o�^n,�g��H����Du���l!�}���eM"�4^�������
�g����)���O��V���n?��c�k�(��T�����$�v2
���S��W��J�����C�T.���[8�65]��j&�j]�'�Kv;��C��$:�K�r2��E��#�qg����-E����X�l��B(B$�,�w����G��iS���E2Es���m-U�����h��/�������b5i
�N��(�%��j6�]����g�5���*��X9���e��i��e$�Dg�'����u%�
E�N�v�'J�E�Q���_�j	��q�|tS7v��2�������p��a	��cJ�����H�x�����qx���m+^O���T�DB>��Ui���daxX8
����5���2S�J��[������
0P�!^��+S����=nG���14�����������.ph�S
�XPJF�tl����a��+G�mb!��)Y���l�X�9,������+��*Kuzqh��Jn�u�Vd�A��,�|������Fe>�py�����VH�V��UC�����Zr��=���iL2z�
�cW�n���s��S�^�����K��iW:m��C����r�,
TN��'�k���N��6�%\"8+���������g6��� ��H������W[����A���2OW�9��b�d�S��]Y�����S�I1sM� H0��Wl�v�� �GS����>G
�Y��PDs��-�|:�d���%��X;i�����^���BX�V�nT������C�� �x�����~z]7���)����$��S���xjQ�[&�P��SL���|k}g�y�k_X������j�+�����R8�^@V���1���J��x�?�&�dW����&�W��	�b���/x�����9�q��A������uY�G��c3*ya�q��g����(��8T�^P)���ga��|0R�t�Q��X6�-���L��R��y������h����l�9/�3����oF�7=�����(*���e�r�@~�}c�3�HA3�,9}���{%0���(uv�����gvu�����4Q���Y� �S�F����QO�s��(�	;������:V�x���<9j�anz�����z�r��V��X{�+��"*�}�]P���G������:�)|��M�����z���{���z4�"�?k^�9��_�7OV2��]���4�W �/z�_��x�P����qVun������qs�0�e��2#�]�/�A�=���g�R��<K��(S�����V�����xFil�e<������u�C���t��o	�3�n���u^���]��d�K|���3RIu:dg(��B��kV.�c��J�W��1y�8�#�q��	(�
��T�@����e� ]��^��h���lO�%���5mj�Qu���.Z�Cgvy�jgU�-�b�e�v�2���+��^U��k�g:x��5��!�Mb�)'��������B�gT���DS��������S�J�lOX��rS�j����	���Y�C��k�6��$=P������E3���.v��uvSs���#��,7�J��f��-���N�x<�:�@��DM����c^h����$c������
���<R377��~�d��CI�8n1���Pz�2f����=�3w�b���U���q�P�/�V�=0|��9.��P���g��k�Np[���K
��T����������r����6�@&*8�����T3rd����r���vs�-*`�U-T�Y�r���;���GH�$�?�	��*���� ��	
y`)�a�������:bw����vyu�
_>0�O6��l4�J�>��,����
���~Nm�F�wTU���o�6���c�	|,h&�h#v��*��'��w�kQs57�iC=TW�v��^��F�� �P����3�0�>^�B��'��F�O�W�>�-����4�}���$����6���U������X{ji�"�����R��Gyy�2W���p��)q/zjZ��E��Z������W��g�����C�S|
��
������@��>Q�{�����"��%M*����r�c���qT�@\	��1���t�����l)���@�$�����wX[��H3Yld��e�/�l�OYSk2S&)�`�E�dF*�w�����PiI�7)t�t�����#����^�tZh��{xm��_/uhH6�E��|D���_;Gy�v��XV)��V��#������.���v:(�I�j����l-��	�� ����f����zFA'�m�-��8���b�&R���X���K���$?����jl�S��4���79:����F;V�Sx�-T�R�����bt�WM���p���� �t���E��w
���z����I{��Y�B	Y���Tw�,w�|D��a��Q�IX�qP|@����YWw�~Y�JS��#�!�%����!��9:*���O>��"����E���I���X��`\�����q�%�9g�G�*U�QjMc�:�u�����G{����p��g�Q_�n��5I@>����v(.W��I[��rJ���k�DA0����s�"��q�1�[������a>����"����lU���@���Gxo�����,C�Gl��G]�X��bz�}O�����}F�r�jK�$�i�I�x^�\�������a �xnS�2����(|��������{yT���3?X�j1n����G�SJ9����U*��'�tD	��R'���q�����<�#\�)a�y�y�P��n=]/8ID�9#�<}F����0S�z&�c7�;���;$�U���nq�aCWQ*B�U�>0�������P���R��8����,��wP���[��Bv�#����B�zpNC4����5���o�L�3��]vS����s\IB��KA�X?���T�$_�b��9h��`�W��Y��\@������@W���#��j���Fk��F�.��{��3��6�"�8�&	&��OvCKW� ��t��P>j���]?`$$��,A����$�����*��-a�"�����w�6��6����2�
�v��q����p�-%��@W��n���DY[2o�!}�c�������#��d>�P�5d^fZ�`^��m�X�����{��j�8���~~r���i\�����N��vk����th����}������R�Q.��(�!/{�Q�7�C(�������.�f
��V
9�	�^;I����.���
� ����o���uC���w��E(>�-�G3FC�����b����2����2�v��2>�K����������jP�&�S�t��u�.c�rN�q�#jf��2�I������1J�l�|@�Bd!�S��r��X�>���K3���>�5}��}������������9�v�v��n��c����[���GjqIhQ��}��G
���e�H�5����N����8 ��Z� ��������!A���2� 2�MY���7����?�{��� U�����P!�!�:�q���5
gPB! @���k��O�M�6�LJ��7N��}%���,i�:]z���L70�.��G�0����H�q_��`�PS5o���a�B B�~��,�-�m�������s�j'��~{��d�,7A3p�e���@[?��H2�'�\�����}�JBBO���`*�U����wx� �gMV�)o�9^�t{�P0r�L#�5�Z�v��fq:�CJ�Yk
�%.G� ��t���%��e���ni6z����9u,�3�s�����Gb��Y9��������=���f�D���_O���|���h���%?�|���2X��~�9a�Oj�	9�!E�)�u��z@�_7�����y�DK�� �PML>2�]_I���k���JQ�5	.���w���U�CK�o����Q�~��h��Ik�K�Q�D���wo9�V�Q�nR��?����=4|��JJ�F��.��V��%�+��TD-$|��r\�:(Z�0i]]:C���[��~��We�|>'�-���Epo�<t���yQe����}�;���7���RrK~���L�IxHLI���(���/,���S�i=YUmp{�]&=Lv����E��������.�H/�V1�Lz�AzY;|(�r/��-��j�CR 0���!Nu�1���N�U�c�����r�Z+����V�{7*�����Z�����������Tw��G�8��z�us�F�=Z,�f�IU�;��j<�O���X���r�����i!���
�KS�<�T��}��G��x�U�b��gd��~q����%32=!������|oO�G�O

��e�%�z� �!N�
]LX�W�����4�`��g)��-{S�%����������0|H�}o!<H�m�K��M��`������s��������&�O�d����p��=z:c.=�r�<�G1V�R�IxG3M��N���u�rT�4�v����T����3l��
&����x�7h'1-�{F����!�m����r�����;N�:����i����y,���g��6�e�_Z��r�M�8��|@�E�l�-I��.�u��F<#g&�>���X:��>j�{�����K~ ����49?���h�����$�����LJ	Nj%�E����PL��c�J��f_M���J6�k�
�o���|����hFg��2v|U����d��i��>������c��0�4�	��U!|du`���@�~��e�����	��5��rD��UZxW������:�a���@QFz��A���nb��j�U^`�MiI��������l�=YvU�P��o�1n#	7�
�Z �L/*�n�"?_�Knv�N2���Q���c����v�uc�b8�>0a^={��|Z�g���0���F*X������V,�2�����
�x;e����oK��T�
��5�&����I����K����|v��� �&�nNI��-,��RO��V�L���Fr/E~D#v�Y�y���������UG1p��y��P]�Gg}#m�.?`e����V���8e���&		����({3t�����K�����8�8L�:���=�}����.�42@Q+��t������������G�'k��S��8�y�$U�_�`������?���8��M'������\n���w��l@��/��tTI�j�I[<�Om����=j��b��Z��jZ'��k����vP^S;~v2��CME,�q���0�8i�.��ZWo��x�	!I@����D=&&r��G��?�r02�+g������#���
V���U��Um��o�������/�$1Kq��p(uf����G�Jk�.�����T����by��W�����UhE�znJ�M���+<�/{�{_qn'�r���4�*Aa��}��rL
����a������xAyf/#�;�xZ�J��3T �Q���*7�����c`���&�����������AZ�������n���N�n����-�E�:�~:�`&Y1i���O�i���jw~e.��=�,������m�����Bnjh`u����$��g�d&L5�h����\�C�E�p[z!+{�*��R������L��4�6�h�	����	 ����8K�s 8��P������H�w��'j�X�k	G` ���8�����ZN�
�K�3��������s$��1����q@��{�E�����J��F]"���E����������^��R�3�@���AB�e�\��M�"H�X�~����,
'��#��6Tk�c��jB���,���(��z���7�V#�]*]�`�^����o�E����E�Y��E���)�Lj4��8G)<��4�/{�D6�8�7�����
���^�z�,��F���(���PZsg��!�w����b��Iggm����R�����(���3�}�|k$>�3��~'X�^N�"�Y�,`���"hhtBW|M��>UZ�a��P�c1��%�n�)kgj+�p��g�
{e���
WXgp�M�j�����U���������C��56*��b-QE�.���Q�QZ�	u,��"���~[~�4�57O
��|�����9�oV��%G������i����������3�1c*Vtf<}c��x�3(�W#-�x��u��E��uJsMan�WnY��RY�M.�y���Q�a����=�)f]�WB^�6�%�|*aB�or����V�����P$�4�[)���C��&���x���t�����jm��}5�n�5�w~��D��Z2j�=�!��m[�!��SR���Fz��d�����a��_E������L�F�i��OH�����h�m�V���K ���\
�9�����V�YT��5��_�4�z����U
�5�U
��i������k�2�f�).���d��P+���"4�/4v��&�v�u�S�S8�%j�f��+���������j��P�x�E�I3u����F���/h*��v,9�uB��Z���%�@*�m��6��D�_�:������kCy��?Uw{	��E�� S��ys�K��xh���}Z����$|6�o����O�G��|�X���{&rw��f��t���F9�>�~�=��F�J��Du�%7��I��_+*TO�P�S���hG ��]��� c�8M���BKd��y�+�LT��oeb�T���U��P%�\c���a��U�I=��V�����8�a�
e�&zQ����jK��������
i(		e;)����{����7X������^�:^2-B��'�pm��^$'N�=�����1=���c�����(��?	.W��-*A\�������[���=�z�L��h�vVs���5���z���"�����C�|�%������^��5�8��U�K�����	�����e$�t����[4���PQ&���I�����km��k�g8C��.�������W��*��j��W����%]y7mD��T���,(�nt#9���h�-���E	������������h��Pu��Tf4IB������B�����{q2�{�0��%����������������l�C.x��o�i���vT
a�\���[+���I���a���u�T9}\�qc�H�j���0�����	~�\���;j�E��	�����Q��M��QyGX������a��PB�[���h>
�r�#�1��H����*�U��'n^�P��
A���,K?���~��� dc@=�����h��bol�U<'�����:jht*�� ����a��:Z��0i[�T�u�8G,}a;��������������]v9"�4i5Hd�a�g�{{�?zX��Y�)�������.�q�������p+�� �
H�Q� ��,�7�;��K/�lm+�CO�$xz���e[���fpZX�a���^!>���T!�Z�+_S���E^c.����xR��l1�a���rkvf�,Q��U�R����M��"V���pt��zM�N�<���f��>����C��z�O���O[)�}��C�E�}:��o����� �e�;��,�b*J�|e�	�P�}����������i��q]y�r/����f�$��%�3% T3��M����K�����:)s5@K�0���S"-�2��|���\�U��b�1n;�Jd���2�d�k�Y��M��4�L��
^O���W,��TG������j�}/�����|�aC���T��|1���R�[�;de��C�
�4_��6���6�I=�50Z��W���c�jiVf����#��3 i^G�����s�w�L��S3����q�L��
f*�/���zV�Xw2�N�G�]LQ���XV��Nh�$Q��/�PT�<��[��
ViY�nj��y����'��L��If��B|_����5?�d,�RW��kd�f	����bWid��'�{��^�Imwi�@�Y��8��7,'�gS}����+
�<5����c����Y��G�D"����%����{�XZ�2�+`j�R	>Rx8�B��fZIQN��9	�'Q������K�M��������V����Kkt�������4i�
���,���BL�L�4��I����W�kZ�m���D�����EK;�v��l����l%G���?z�W-*������r�T	XN���+�H�i��P��IE�����z�4q<�����SW��@���\+DIeZ����/�c+�>������1W��}�n{��D�
��At�q�E��t(i��$������o�;�s)��8�:?���*���K�;�>/Tbl���Y�������f4)^���$Re3����� V�����������p��{�,y�d������*9�}+�r�s������C�%h�������w���w[����	���^l�S�q��b�,v�8@X|�q�d��`��	�]II�>a��� ]n���1	�)3�T����_lG_"�����?�umW�6���L�.��0��������	J�1��'���;����EF��VP-���_�g_��R�7�t4(�h �����@1l�}��ui{��`s��� #�����(����h9�C��;�� �n�n��	�jr��`w������$��r��r����i<�f����������tE�������654�`�����&l����=K�O��=t~��![�zZ!W�i�/��';�^�0
�^g-C���>�G�u�z�F��@�l��&�����5�^�
�?�=����f������C��("	���������:��i�C^vD�|*s�l^u�*�����'��2���VRT�_6=c���	M;�V�����cy|���C��V�7�(#���������c�JZ7���6-�����������_�������*���D����{
���������#W�'�5I�f�Q�&"�O��d�wMJ�N���I�7�<�4����{��2o�?��|���;�����6�4�z�f������ �!�o�����2�sm�C�<�5���$��7Rt~���)w��B��e��-Y��={�l�!��=�uLh%(��(� I^w�&����-]PG`aL�M������N4�:��xL����}��S�$�����3��o�7���o�.���E�����$�9�a��2�lA})8<X�x/|U(����l�!ti|�c�!�L��yu��_��}��� p�Z��|�����L���gp�R���<��AKZ�TOt-r^Eo0#��������{����I_f��t��&9���]?v��R�w�yWg���>�)���\�`)���Y�:C$	���!������rr������y!_%���������Yu	�b�A��*�������M-P��o�m�+�3�����&g�����n�`����-�T�7�h_��'/��g��0�dS[~��_��F(�
��F�CM�,��#v��Xo��8�H^rEq��q��$��[&m@^HB��<c&����'��)�G���O����cK}G�V��
��
�3���#�2F�f��G�h]=�k��!'��7	Hh�&W M��������0��*xU.r��eX��Dq�B����w6~Cp�Y�����t�)^8�j�K�X>�TN��"�oL������	N�9h��O�1��,���b�%ld���!}�7��Uk�E:03�vY!T�����$\�u����T�$�BhdL��V[!�~6�=��N��!���Cw��4����?e]Z&m+�O'?Q����)���7K&���i>�(XX��9�w-ARWu!o����pq����&v���x����U�
k�J�A�Ht�up>��c�TB�����=J2�F���j��Q�kU���
�ku��zE���j��R�U�2V	��p��:�'�g�E=qN��4��9���In�}��}�c�
q3G������yp~�J����=�
D�4���OGn�"�yui0&�p����#�~���V�T�h������p��B����0��V����l2j�&��+}��{S�J��QWI�i�	�*=�P+��d�>��Aw+�V��sd;�D�j"�B��.i�KrXL�����GI�����g8��������RK���G\^��a�MoS n����,�����VK1���	����k��|kO�|Z�L?����y����!ES5�\9-����=����YsU�}"������?��PS����X��V-~�[����!��5�7X{��	�?B3{���v�tz��/���f��[��s���s9��v�\��(?h^Y�N��p�u�Wa���� �-��(ro��'�.��f�4�i��`�����*'�2���3.r���5�#qro�2Z6A�����Ql_�xt��f}�hx�/76��G��),8|�;KJ�����!�r|��e���WB�',���[��C:(��L>q������Y|m��l&����#��e�d����h�d��b/�u��D�	T�:?��<�m�=�"@G�hx�}��\���z��d��X���{OR�m3��a(��9EY���{����Q�&Zx�CW�A\��l�>�]�|��>��� �W���${Y�P$������\�?�In�6���#K�����\�����V%���B\_�y��h�R����/�'|L+6��(c&�j��p��~��,�v{��gG7�^-�j�H���@�����Ik�X�H>������Be����C�������}��9�x&�&u��� ���|�sc��"�����wYr����0��w
���r���!�s���i�>[}�[:����U�K��cx��)���56=�p��}�#��4���p<C�<���4!Aeyz/���I|Q�uR�w�sh
7#�(5#��=U�������	_2����G����������_�B|R�:��
���(.�b�Ow�Q={h'.a�Kk\�{��CX)3�����4k�-�f%����<.�	�t�F��su�K=3dRt7���0�S����E*%���iCW
�O����Wc�.�L��1T	�@#�y�}O�NMWF���i��sR�5�
���TU�_Q�s"�.��P��A�)M�V:O����;��97�,���Y�����9<R��@S�A�k"u�f�O����CM�c?��i�������-$�6�Q���y��.lj���3#l���@U�X�am�9)5��B.���(�or�
����������	���'�� ���P��|QN�����D�n^3�R�2�L�%i�O��j9rdF�x�z,��Z5G:+O��+�+��C�r��������!�i��SUM�1����hp����=�v����������k�c���OBd����<���E��%"�h�j
����yQ���
�����U.D�"��~�V���	9��A��}���j����V';q�)Z5'/R�P/�h�����:�M0|���2S�~�z����*4���;��
f�Ur�����4�a���[��c�����A�D�T��1���B���N����;}����U�H���XU�$������)9���B�	��H�\^:�y���G��?D��d0��M>�Bx�	n�/�"<�sT�����v/����\�H�r���M3|1�(������HUjb���tk�4Q�C���q��x;�{g��^PQP�U��|3�'^��w;���l���-d�����yA�b��E����<%�.������j�%)v)�H,�[��N��	Q^�o|����]~��s�C@�������D&?��#�a�JbY��Q����+��9A7V/&��j?xz����K����^����_Tm��E������/-ds��S������^~9.uCt�>����$$�S4�TlX����5|L[xfk{�]FG�a*��9;�m�F>����F����|��Q�l�����(
/<�Q0����R?������fIM��c��R��E{����������_�uvK��y��E&Y�7����T#�/�/�bHCU���~@<�M��z#��U��-�~�~��p���A��<}��k~c�g4���D�s
���~��/9�b�N3Q�f�u�G/-�F����	������o�����FPs����2���:�ay����j���2������z�3�^P���e�&����#^��"�^0�a	���p�gO��l:��h&BR|v��Z>��_��^.�������r���[�+����y�6�E���Q���z��#}\�_����s��%���A(r��\���5]�%��x���P�5��uK}��q��i%CQQ89�W�T}�D�a�)����'C�}�7�d��\��"'�W�e����9��sC�^�Ft���cy|�[��%#SD3-U?�������ftd��p�
���_����ZiG�����[��W�*�)�'����UE��>k1�Qy���6��^���t�g��5t�\EY�u�RTU����n������tkk5�}��X,�"��!�����D%dG�#������z7���I;"$��{�����[��mN^b5��H$�����C
�Y�����W��[��s���@	��Eu�
���m��eXv~n]���W�	i9\����u]Fy0��z}_T��1�Q����Q|��T�N��d���Y%�EU����y����F���#���/
�{9�4�n��$�K��n��G���\#�Q7�+�'eD2���m/a��n^�VB5���9K����TXM#������j=Y,�n�������<
p��U����Y�r�H
����-��{����)���$�����]Z�6��-i�Tj���S�����������j3�NK����7$h������G1��
����.{���!h�s�L	�����[^��H���C���@�r�w8�\�	_C[`�T&��i����X���F|R�<�<�c����E1�~�����ToQz�����^f�$�Qi��M������7vy�o��#)�g�4O�3;��eP�J��������6/��8�0��'�o����M��@���C���2ue���YF�I���x����'�N��A���a�{s)��5�oAq�x�j���b�K����>�.�S�����Q�� �����igJ�b/�l9 ��qR���X���r�%)>���K�Y����j��U��e�j�u��������R�p���$O�4+�Y�S�!�
����V�0�%�2�NQ���4���Z��gM>S��H##���^�M��e�����%���
����!���oS]�PU�)z������O��k-�~��	� !�Z�
H�SZ�.U/B�>��0U�o���big������(!�B�Y�� ��eX8fy=�z<VfR���"���g�
y�S�V�K�s�!�����eM�A��e�����w�\�gW���J{�J�>=3�L/G���>J�m�������w�;�<���MUno!�9&����$ ��>Ts����$�`�nG5�)���
���j����sM>�n��
O\��n]Z��=�����BD��@�(��|���"�-������gWL�7.����v3��qs��� �}$�gk����T
CE%W�T����&^���B2&�4I��s9u0�z1u>
����X��M��=M���<�K!���<w7�nO�<�y	����W���Y����c��0���|������t������v~�����V,�#��D�&O�Sf�2���`4y��O�����w����h>a?��t,��� �-���F�n��
0��i��3,�h������'��������,;�v�������}�17+F9���J����,�v��W8��{h��ji����r$�;��7�������/P�,ai��&�����������\]8����n��%mT ����}���d��C_�Q�&������6Z7~�7�
�����xx�������^�����!���\P�#t�14�����+e�z�L�Fg���������nA�)*������9��L���O�NM��9h6s.&
u�8
�3y�!T�b��`W����K��>6��fx�s�o�5a2&|����LK]�cv�kE[����i)+Z��F��������Fp ��:bUZi�e�r��r.�����5K*��'n��E6�1z� �"
����?��a��=1���~^�_0_t���Xj�`��{G��Hl���W��W�n�N������ ��h�
P�N������B"l��~��C<�S�� ���������{H^���pl�n~��2B���y�6���+M����?��{�z^S)�-������=agWy�h��q��f�^��\(_�1��o��������yl.g��5��<	���cPL�O�%�_Mu;,Q5%eG���!WnZJ�nb�D��@�� �~�s�Q����S�]����[�������K�U0M�3(��$v?��*�R���a=��`<I^��x��q�`R�S
{���=X����o~������K�G=J�F!@'��.�-��=,pq{��&M�ix�I�tux?�2��h��w2���VdC��C��n0JJc�(f\Q����xC��;�)��]e�"�U������R��1�)�*��>0�g�����R=�+;��m�{p'�^��U����b�W���F��i��Q.(��r�==��e\���8p�w�dgl�bIj�C�����FhH�pB�}�1�Z�Fz�S�%���%7#%����������.�^���@����Da��}��q�R$�j�q���~R�����mG��1��t$ �����=���A�JUT�����Wp`x�a��e��U�\G����]��Vz6fP�}uK�����g1�t�-�9�g'.k�))��� g��35����}i��#���������-8���A����R��a��9G�||1!���,�q{Z�K�����I0r�qA��PR���M��s�h��o?������_1����)�����nt������X�*OL�>|�p0�BO�$��	�s��������^��M%h��=��'���uu�)�RR+�F�Q|��r�d���IRj}�wD���m�N}J�x)��w�)��E�p������J�C�^�<��E��N�L�c��A��S��E�_bo�\��MU������������C>���kN+I���}�a�%@.�v/fg��A0-oU2�'4v��tr�f�
U;J������w)xA"x=��@z�����S��+�J���tV�(yd�	�=��l�
~�5����i>M�rm�K�r��������U����\�/�f�������u�������a������������5 {�7��N��_�I�7�~�&&�����O�
��E�nT�!�,��Y�KB1��[����3�q]�)F�!���s�1~��g��;�[{0<o����E�����Y���s{	�O>pv�8��:����]\�@�+���W�!���n�cE�'F�j7������3T��p��? j������mZ�v�1�4%.�L:��~������}���kQ���R3�����`�G�R|)	94e�������"���d~U���W}���x��jV�%lf"�_�u�a�UK�0���)��(��z����V��	+nH2��F.4�7_.����������s���-������>����2�&_��n�U�>UPI�y��_D����������T�Z����M�b�!D��\1�����
���I��H;LKPj��Cq�U�����y��=l��Q��xw�
�Hm���d�/;��E?R���B��n�B$�3���F��}�CG��SG����U��*�����������H�y`sg�y��Nf�����`D"����c��H��hQ"+���.^��T(��&�f���h����RD]� �w����Z�Zh\r�H^\��U�]��,$�c2�_��<	���;U�j?������g�U�c������I/IpN9��$k^2W&YQr��U������|�kF
�U��,����m�,�
����T.�ha����X����k3�Zz!y���������{�GN�lK�����r,3'/B���x���]��,�OR� Oh*���x��Z+�T�j^���)�/�]��� �b�/{`�� ��Oy�@�4�������)�w�{������w�Z���4���������qY�T�Eu>�d��M��������/���	��.���]����E~�/X�/�3#���N�C�D��`^�V}�oG�D�����U�C]OHK0��+A�N�,�/���rL�f�1c��q���
|�;��;=�KYN��xEO��@0U��f������d�H�(��R�po���q4�=�LGE���m�o���{K�-�d��Nzd����	��F!�&�&
�����kGf)�dV�D�� ��i�:M�������6Lz�U2�W��RY���'H�`M�vj��f�`��{G��!�+,FNn��v[��Y����"S�C#p�P_��|�cQ����6_�g� D��3����5�	�6�Sd1�|_�(�+�������w��u]�J`�0
��$�����]B}�s��4�m#��X�&�2�du��4z����Q��M�#��$���e���v��I���x��S����t��i`"�jU����eD���c���-�O('m~T�I��K	�Y������F�X�t��z�z��<���#oy�a�R����VF��Y�j�PD:������N���V��1��|�,z���
������t9�����&��V��B*w��B����s%�_��`$��k�m�)�1�{�r���gT$�r��k�U���r'SS����TFsTO/����k�i���A08#}��n�>"�q*�������E������U���Y!-����������Up���P0�M%����Ko8�HIt>����x���=�O�������Xf������Un��`��.0&�a�~?����M��F���9I��2��9�n3grB~�9l �~0��r�����iAQAk�7T�����6����$xXO�KM�n�4�m5����[��5����+������Se�����V�����C���k�qi���~�B~+�GO]��EzC����|m���� �������P�$�;�����-t��&?��aKK������uw��/������A�4/J�^�u��b��CC7*]m;�|v���W������p��eZ~�9��}s�����)�6S����W�x�������
sh*�(��$w��1��0�s)�X�U��J6*^��][����JI�Si������t+�cz^G�f�)o��|�������'��?2B��6S)I�������BCH �ssi��g���x��n���a��u���J�F��
v<��l1t%����-�-����7��O�)�@Q�K~��j.�Z]�\x�?T������������RA���������)N=�����|0�����}��S%��JW�����p~<e+��/���Z��{?�������hA�i�l��|����,i��K��������.�?j��T�:��������T���!��1���A>�!R�
������aR���n[BE�K��
HPB��C��<�8u={3�y�����{���g�"-�U6
��
�B������n3��3��IY
�?\����RT�/�zx���������q�AN�3'�y����K���t;ei���
�c�����&(i3�2"�&�d��-�'�N�����iG�d������������azd	FEO�}��H���/��qb�����&^��;����I��q�
a�UW��>���UP���\�(�:��UP�r��1��Z#���T������?��9��P��T�{����6���?��m�|]4�cVS ?������2#�DF�p�U��:x$�n�)�bcF;������/�A���"5�P$$UQ�/0u��h�L]�����+�{i���?�!6�j�1�	��7������>N(if:�
A���.U����G1��EK�����I��^�k��^��i�,y���4���VZ�vU�ca1�_����8k����H� �}J��
��6�N��H�T
(�RTr�<��(��Y��P1���,�I�]~#�6/�K� $�vV�3�p_�������:
��6�l�O>xVr���93�'*k&�w�O~�$�������E�
�+� ?�#�������5�A���0_G���������X�:g�>I�|TF��M;���]��,���	��m��,DP��5��a�u��0�fb����d�! z2��fuD���l�5�rs���r�:�>kX�7>~J��G	uT���9o���ub0��c�V�P8���Is�F ��z0��V&���$�R����d����+��C���Y��N������6�����0�S�Pbt��Yh
�"&��L
�^�9-;[0.�c.�Jl�|�<��
�uAz:U�� YGj�>:_�)7�R�;���r&�u
�����p����tbTgF=iF(����V�:�����fP�����)��iZp�����
�G%=p��a���I�g��Q%��"!8���?&q��b���)���:�����a��`e�P�j��)?�>�L���&��:P>1��K|�]���F����*�9�������|�q���\��)�G�U�J�l����pH-��5I���%�U�i��W�����G����)X��y����@�!>�a<Qm�Ar��!|�^��*���p�����D��mi�*��%P��s��%����%7c �x4ne���|��H����6����Z�Nv��������S�)������J�ySU ����lz��xPG��H���d�\0)��|:t���bR�J�����w]����1��J�1�g������w,3���3_��0��s������)�%�&z�����{����0��;��e`���w�Sy�����aI����M���s:�!��{����u�3W��m��R,(?H��v25�#���I��� � ���yO��Hjm��BKPU?��gELr~�S��e#g���d����m�O��:F60���t���d�����QG\�������Zr5�j}S�W.���nj^����R��
�Wf��e��u;��R�oo-z"�����1J�@{�j��P~���.OV����yT���v�sxS0�<�������@
��F7�������"-U��(��+��3q\��#��Z�%Q�?)=@Y��\oA)mw�
�������u��q�"�R���Q�?,I�v�#dD �U7	��d��*R���>��*�Y����b�)p}l����R�l����:�Z���rQj���Yq���"g�N����a�;��_}UY[��)?���4��ZDjH�sUFsU=gU�������J�����a{n1�WC�5C�NrQ�cD�3ZoY�?b�������N ��#:]���H=fSk����/T��9���{���#�,�E������
�sJ�N����PC�������Ck�1Yi:�r������w5�b�AQy$K&����=��80�V�yq��&���k_*�WkeT2�
D��1���J>����f4
b{���1a�����i����$"�=s�����w~�~��13� �������a�M3R��X,���F�����i#������n<��DZI�Y�����7J����{�t�>D�(��������������}s $Wg]\-����j\	W����iy�0H�>�W�����9��v�H�A]����u�Ekd��I���e�Ji�����E��5Q������6?�<���*��TG>��U,�7�T1��<�p��F_�Kq|����%c��vP\ ��p�|4�����o�G2�������0q��>�i���\UA_+���0���W�F���B�i����=��3(�l9t��}k�������D��b����B�U���oE���H�
�/R3��x_�)6N�$��g��_�c������(�x 8A�MXF��O�,����&������X�����K*<�&=e�2�����2���F��O����m�>�`{�����R��y�lu���;i[���}���#Fv�O����A�"���s�R��{���s��k�����l��t�����A�`�����!�];�HR���5;��n��W��a������6�{S|�����mm��������^�/U��g������OV}���.��C�v����H��>���C[a3���������tL�S~�^�������6�)�q7�w�L���^�������a"�3���KKF`�(���OPF�@z�
-+�i-v�h�xk�G��v����v�X��e��nu�\�7bw�	$n�Z�.�"{�t��En�J�:�(:�B
�
]��MJH��3���Y:���'x��YM}����c��W��~�RM��b���j	�:/��E_\��#�y"jW$�(�g�G�9��	C�I�5x7�<`�V�/�C-�/x��k����!6�'�W�Fa�<z������a�%���s-���X}�MJq�G.o{A� ���N�i9�����q���-��!~|����?8i9W(���v����0���������4�$�#���s��V��J ��+��>��0�J�X��
�����[?������^��M�X��~�wL2 �6���N���YI����H\~�U$�zj%�����1�~�V����$xH��^�g���m���nr�����:V�H���m��+�_��b��<�G�iT�>�g������uW��JS����;(h���e5:dI6Aq�����'}m�i6��J��3z�H�{��$z��Y�����n������p����]UA�eQo?&��.�1��Q�gVwj�d���QNc��h|�1Ed��}�7]��0-�<�	L�B��'7]x�� 5�$���SE�3�]k���4Z20�������E1������sG�"�u�\
I1own��
x��R��������u��?���|��r�K�l��{_~4�5�n
�%��ES��_������F����<*���+r7��������u}����%��������N���t���@��������~�w����Kt�$�+\n�����]����!N����52�b�^Q���7�P#�P �����z�*Q�+�oJ�H�{����Y�FL�e6�?v3`K�I��(xML����g�.�;b��nQ{;��C����Bw�����	���sW�������G�����d���
�w�q�-�t��
��I;B���_B�eM�R�8:�����u��K7
fe"k���,����r}�����4��j��� �G�������MK��*�������J����N�6e��ogK������U��*�DQ���9}jw������~9������y����+��������}�,�T$"}�Wy5��A��CA+R�6?$��)�b���_��b�����KY9��y�������K6��ssz�jT������*�_�����<�e��2�jM��Q�q7�����H��9O��:������`��m�YM��t-�.)3�{���0�F_B�;J7nZ�%�������e@c��6���x�>���L���\��@��Xk��%��QAeC�7� 8�y���}@6g\�s%�|cz�>��-[�������5�8���v�&4'�d42a�������J�PN���p�)5@��/�������Q�j�h�c�W����`�����!����yp�S�oh�r��E�0����oNk��w~7��2W2�.�^'��@��_��f�� IK+�oN���}c�`�v[Z����a�z��W����L
�T��V�>D��8��i�����Bc�Z�y��_�(�}c7��4��	��G�� �9n��g�4fg���2���w�
�X�lS���WV�qa��4`���������CQ$�&�K/-p<��F��9��(*=I������hf�o[��t�<F�^�X���������T���'m{�P���u�9�X�� 	�SF�.��T��h����-�-���|���b��]��1j�]��:U�l��>�t�$o��J�R""�3�x�����ni[�\��6�:�R?���h�"�������R�M�;���J�%�5�����?�!���
tZ���QOUH�^���|���q���Z�J���~�q|��V��`VB&�hJY>�KW�E�FX���-
�w|&�����=���
�e�._���A���Kq����8_"Y����X=aX���G5G��1��e���o���e!4D`�3*4�/j��w�*z�����8��[z���m��j�-����Qb���]v��l�� $q3^Kl�F������ ���f&����?���^���O_����F�y�\C�?�����O���	�o�V/�S7����KDD`���D�c����<i�`����L:�r��#U��T�u�5��}��������N��!K��
���d")`��c�>�C���M���_��w�"�SG������I��d4Oi�w�������!f7����|�����vEQ�dYs�p�o�[z�r�N:�c�N@���A�V!�9������6�5 *�}��.'7�V����VR��lu�G�����N���9Q��bi�`p�X�}������
�}-�(]}a�'<r�
X����Y��G/���
A��N��P�F���.��8��C�)1E��U�g���at�M{C����'�7��L����`���I�0�}����5�@�sN��>��;Hre�R�P0h�b���������Q����N�`z��P7 c���AJ��J�t����c5�-)��`,'XK��2����0��p��>���#�kq)�������
�d%�"�=���sHb�jz�>���b�V���6�y���n:DA�h���Sm��
�F8�����y�(^N�1_Z��r2L~����`����|�)e�t�K�>]��]�� ���UIM�������<���#�L|_��7*�"2�	����wrl-�V�3 �.o�5�\�&)��y�*�`���TlAPE�w^��%r�J�TN-�]7����;�fX��ZL"~������i�6�6vhi7�Q�����C������+���vfc��&$`��d����7h�7�j�U��0r&���p���:���D�����z�~����;������@�Be����G++e�	����X!�y��''��/��vy��?�$S/�����!O��n�q}�<��M��p>�\�9l��P�g&��{����Q�{S�Q�TJ)^'[1�'�44��/���a�7K�U(E;��t���"8�"�G;`��Bw�^e���vN��>��'{�t�����#�U��\XV�����n�����N���zZ�b�~_I���	���U�'W`���Q�4�@�2
�ga���;�D�(Q}�~A|���l�E��/��|NW��c������T������;�Fh��0����Wn�����w=������[�s��
|����uS������ �r�a�o�����T#�8��J���G}���V����b���r�w:u��F�m^��a�$,�}Uzo�r�e���=��r����v��["]~��w_�
�b��\�����F�v(U����N���?����W]��zE�`��m�z"��p"�^��m��
x�:�=���w��7������GE��B]�Go��|�V�/����R9r�y%������_eCP���Y��.�Y�l~�?�.���11��� Iy�J�����&Xfu����J:�O�Dw��4J0��gX��(j�7B�.k��^����0�P�HbzeC��z������������#}Q�Jo;��Y
�����d��.�>��3�Q������w�w)�V��?
i�i����~���.��?��<_�Hh��Q��z��SSw	�F�]���qDk�"�p�0,��oz�U}Hk�O:VJ���.�,��
.���7�\.��:���`����}muI�`��:�e�����{���1�����H�/xZ����c��'�(F��j	���r$y��K����c��~���Y�U.���}l~(G9&�1�����^kQ�����B9�����D��X�����l�-at%���nK�T"����7�^�N<__�a�Y��|XV���w���}���;,���T�	U/��Af�y�w-�%
ES���������6v��8�����|o�&��0'R�B
G� ��8*^$��A~"R��UK��/kt[.7�X7��F�g.����7	w�-n+�B�l�RhbZC)L�+��������,m��Q�R������d0��O=��+�}��e��vH�3�@�|@���(�m?C���C��
h�:��@B�!��q]��%S2�#P�7X*a�x�s<���F�D��5����p�|E|��
�#t(�!���w�4�8wJi���](���W./O����w��
@{_~S(�\����cN����n9�qo#���_@�g��u���7�dp����=a_����J��iq�8>��4��ff��
���r����@����n���E�����V0Gn����������rL��z�f����*
������_2�'��u/��W6h^M��4�+�>�HP���/���N���������m����)>8�1�k���~����������e���
�K��6�1��_�������
���]���To��-��*v9!�T8��O���U�y?���@I%�j�+%��p�]*��\n�U���5�����U��@���}��0$kY�)uu��d�i8?G3��E���&��XT�da������	G�9+Q��'�%.��>�����3\�L�T�X��@���B|��g�c�wc����#��{;M;�x�����Q�I���S7��H��oy��"��1�RK�
�0��g�>v���[t<�Q�6B�����^
s��L��zR�L���H���L�����W��p�r0qz5qRuQ�x�`.��g��l7���{1�R�w�u������e�YqvL���H���E-y�i�_�1�P������h�D"������%v{�#)�:B2?c����
����S|������p�\�ovz�G�Ljii��*Zh���:�~���#`�/�7��x��qc�R�K�oM��>`��"������ -�]*��_<���St�.P�������]N��i��5Sk:Zy���yG��F�x�����u��?�x)����KR�u%+^��5B����W����@��0����-g�t�Vb��.Y]z/�t7��a3�8Q�D��>����	���A��l��F�(e^��d����G�q-�Uv�}{�����,�rTE�x{�r{-%��CaIr����p�"���������o�[�v)���$�N�%���O(�j�����F�������E��l���vN�G��^������{�����\�o��73��~������j���hq!�J_Z*���Z^�V{l
�\��` �.�tV�(�t�f]#�0����wDed|2r*��JV�<�8��(�2����	UQ+���;�Z����QF������~�`$���G.�A�#/}b�]
����CL����`��}������T���q)��W���[+�f�B�tl��-���sK���?^��K��|��u��&�t�=X�p%S�q�K�o�M����s+s���*_����6#�;a�=��F�����HEw�X��F$X~#�S�;o��H����(�����@D��x/B|}�
��
t�M�^����l�������q��R�!��s�n��O-��yFH�T��key�|��&{�JT~?9�+�����;
�6u��� 
��j�����2Z��P���{�^��m�j��2=��f�8)���'�wd�R���������>����>�o����hX���y{�Q�C)����?J��t�3^'����Wv-�sB��C�	�����a{���To���~�2�R.���% E��
H�LD/��{���$X��_�k�k
\C�2�!���8$]��}�O}��<��`T'R4����>��'����J&���-^(����J5�/����a�))����tm��A(���9{���d�r*g���$^�~
x���G��9�Tz���G��f�6"[v��!1�������T~:��4�3Rhd�3������k�Qs������
�(��m��2�\����J4�z:�h/�N�g-eA�|{)hYz�$M���^��MV��NRo��e�ub��N�F���9���eio�7�k�����
,' ������gY���`����������������|x�E�J�.%(>6��B�q~��i����u����R�#{l����\'��;'K�ij���Rz��O�/)�O�9J����+�"y[\�{?#�Td�������,�F}m�����)R�bU��Z.h��}9��|2�{�+�<qTU���o}+���8���3F� y@������������^�M�1��E�`��.�����vI�-�A'�^�����g������
&\.��d^<���%G�'�n��KM���_���	�����nh����o�i�-9���Av����/��,���}gv�2��d���P�.��}��
��sQ�������ow��\J�G�t�*j��aA����}#(��]���xFn��0JX��ve�n����T$�R\B����������xc��v���0h�o_J
���!�MGPee�l|7H���;�"gb�k�	�-�c����ox
T��PT�]e���,CP�G_a�=��o�EB���]{���|@=/]i�������
+��G�0&k���ue���c������q����\��n}h�y�m���������#T��#��A������B�.Y#AY8�9���r��B/G�M�UR7�9�b��;���I�}�%=�G��d��Y��
�Tv��v���U]�.��a�w�iU�����l{��nfRyo4En��q��:v����~j��]����c����"��z=���?=0E�7�����@]���PG\(�`�{�L�i���t���_�����������R������[C(�/J-�b�?�^�
��h�:��a���J��G��Yx�s��1_�2t����v�W�d%��;������Ra���|o��-rN�����R�����X{�!'�fX���L(��q�
�wt��~R�v����g��\L��Ms�����~ ���A�A20B���7<"��,��H�(��j�L!�� �e}��j��L2+b�s�H����������$K������[����!�����Y������>������Y{��)�����X�_�Xz�1�3�(��d'U�q������A:�t"q��������zj���bal�
5[���9�����G/s|s��n7����B�A�P�g�b�Yj�}��6{�N�fW���[�(]�M�M�~
>��q����`��3MGH�0�'���������wS�p�
���8�W�Vp�A�����
����y�1r��E���F�n�����y�W)*��C������&�y�d���<`���y,�|�eF�3w�=�cQT��IS�VpA+y�x�,�0�����duK��?���(��$v�"w��E*�= ���
hz����}N$y�����t��<��Q�������)��/�4l��e�����4�A�.����rc?�*HI�9��Y��(�����jai�����T	�������|�U��e�Q��.p3��3����]����"��Wt�G��j\zc��<�?���=��L��:�"���o=�}�N������������#���\*�����O��e5�x�\�j�\C�����c����7e�!h��QGH��f������4��Vo\��vVJM�a��C2�D��w	�5gW�%Zx�:?�_�	����K=�wN%d��4���:�^������c}��E�@hQ��!�q}��y@��A1�!����]�����p��+)4�G4�}��D���uM@�X��M��u1T��$��l�j<�8��AW��iI������V�^WI�Q�M
�=�\����
U��H.;�NS|���!��1����X]�pFa��u� ��U�:��`B�6Q�
��D�7���k�/�3;+=�3�;�4
��Y~r���V�-Y���/���ih�6vF�r5�+�1�GH�}�bf��8���EP��l�x��hB:�z��Y�r��`�eun#s�u��j��\�2���]�����1E�v�iq�U2*>�"�==l}�����65QW�5��L##��_
7?�����[B�B^)q �?��o2�0�����y�dnS;&�l�m�b��|Y���I�����D�3���,�kT4�>0�$�����PL�*���������I�Y�s�s��[Wq�Z|�#��M^�J|+}*c��y�'�W���������z�����U�z�X�c��g��+^�{������x�'0�;�?V��&������T��'�w�<���Q���; �'�5��p{/N��U��j�KwI�P*�s�(������<���#hh8\
�!m����K����i�M%jL[���w��Z�c�EC���MGfB�P��=��3��	_S2XG�K��?Q���6~�H J�z�Jm"]�~�gw�'�e��s��[������n�������N�u�����G�e,�nm'������@�G������A{�\&�f7����>�������$U��%i��~��D����$>AS�MO��k�q�C����}�wN+����H:E���+�v�;�HH�[~�U�\pva��E��cU9�r�Z�G���
|���1�piC�f�����e�+~&u�J���{<P����R��X���'}�D����;�>k���}@�J]�%M�M�0ZUB��F�&����SN�Th�G�����}qJ�82I�1�������:�nW����v1����nh?��=E|���r�W������A����f�����";D��E&�5:���>�{'�D��Rq��G��O�\�8���Z�d���I�~�B�"���������Z�_Vq�*���?��j���d�G��e^_pn������.���L��u8����������<�*T^���>|S�!�rNh gk���5v��2��>�^����Y�z�o��`���C
_���MU\�$�[D����{�-���(�������W���	����#O�|�:m�v%\EH����9�������g�������W�4�jew�8`'_�m��b�^e��}q�=n��
7�U���)C%��a�Ex��|)�Kb��r������sO�|Q���PU6�g���>�O�j���_=��
n�qK}�
��$hgj;�~�������b�����^T�6~���>����$B�H?��o�d~���?m���#5�������^������X�v���}�c���K�/Dn��l�K��$j���+�,���=gj���f�j~����
9[�k'_{��"2w!��}�Ez+4m ~r��9�22�7,��B�
�f��Gn�a4�^_��QLJ�	g���a?���Xe�k��4�Aj�*����e���B����������Qj���p0x�Nh�g���U���4^���/�W�s+�P�=�p�e���;�SJ�v�[[�����E���#�����r�)��e2���g{V�o�2����W�HW�xG|W
�OB����<Y���F�67%�F*�F�����(�A����}���������%��F��2���y��X��N|.'�nY;0����='�zy �w���'�g_�3����Evr��N����w��qr��v��=SH(��uW~7]����������G`�+a�����&�*�ho����{��cTS�<&[�����l������Aoo6L9��J�=���>I��
N.�8�p6�����'�E�'�HN���������n�"����|$r���/c��8����}�*��x��S|G=f�%x;B�GT5>�k��{t*"CH3"@L2�2�N��l�
F�OoZ��Pq"}�����v=��'C��;�\i5m{�u�H>��L�dC��}1���
36o�;������/P�;*������d��9�!� ���'��ay��R�t�]z�����R\^�cF$AQ���L+�MD��"F��C�K���Y%V!���<��b.�����@���L����������JI�d�8�s����7���@������&�^��J�&/��N���?j���������kr["9�te� s=�]��X��l��X����\����C���5��W|j�=�w_�a���3�6���=��I��s|T������d$Ig)�e)K2�!r%����������nr`�C�U��6�{lT���=x!�k1a�3�������2~4��J��_��o�
�������l�"	xj:��]s����7r=��!���W��{tC�����%i5��dX�H����c�p&�\9�"J�B�����+�DJ,��4�
R�������J���Gs��g]�����S|h6Y��K"`�����_�-'����f���[p�'�D�������<v��U{��:E�������W,�F���f�Z�O���4yZ�$��d)��/�t^z�5O3�Z�a0�����vy�'���*��rr}A�[�������{aX}�eQ����z-J���B�W��l{���
[
��!�����Y�_!��{��N��I)�%=���WJ�{B^q�p�ee{P���&����*qY�DmO��b�@
��/F��M��D���X�e{�\�����iT�����G�k���z�iB7v��|dW�����-���������Bh�N"���_�C�h�&r�m�J-{�o��W`g8
�fd;#���a{�5�a�w'��6�E-��M�]C��"_���h%8?
a��x�|��`��~?\:v�F��F':�����
�S��9(���������#�M#<ji�T7G��W|�s��������*�_�>]�������5�k;����0�cqki�S=o�QJ��DI��si�����8L�1T��[c�S�&�~�w�+����+z�*�C8+�?8]Z������s�>4_Y	u�v��f���`�/Op/`N�_4��+4T����:����X�lw�+|Y���
����s�T���k���Sv_�u��lg
L��DDrr���G���.��$7m6ic-�����
#����~�,���.�������^�F�J��lI��z�g��E���*�u�%o�Y����/7��V�yM:��f�UU���Ch�/��Y�`�+a�����/������3��6(����nv��vn���GM�@�y�qS��fj��z80�������������>X�|�7f�RJ:���&��f
tr�_qg�NM��������
<�m-�}��d���A�r��a���AM�Yi��
~��D��FQ���o|���2���U|U�Z��qI*&�+�������o)������pZ�N���g�&�$)���"������n���:@;��Z�
M38�zJ�j5L� =��/P���/���j����0��Y8�XS�Qc&�����0�H>BPpG�5����`�L'��6Q[B�\!�@O��-l�
��S��Fm5�p�l����:B�f�}M@�]�w��� �b�+a����O���v��,�G���T��I�@F�6n�h��d��2=�W���S��t����R��j���[@�r����!���%�b*x�@`���c*��?��46���t8�9����\���:����Pk�/����J��.���gM&`�Ww�u� ����'�V�>��M'�i[����N���w�w^f�/�������`\H]�Q��&�]� J����?TPr6������.�������y�0Y���������`,��E�����G��F��)����t��S
��J���W�6L����������$-Cb�DZ�2>��?Y��1IYLu��
�hj���w����50������������
������9�\��z��T���E��V�i{�mf�[���Z�C�5gQv��~����Pq ��u��?��9��
Y��(8�+��8���J��W'Y��l����J�i{��>e���B������m�J������/����<�	O�,��D|+�*,���+]�1��������������TH���Y��_��$���ilh:P�_��� �<���u�!�$!������ 
Z�����(�=��w��q��tR������:��R*�X��<�,��0���6@Q_���n�������A���p��v�O���<�������g���o�bZ^J��,���t]�^�+=���k��J��'�Q#���45^�]}�}f����J����(H&����j~�aZwV
��%��jS���r�]e):o)}���������J
�oC�'T8:���E��h�f���dLV�O{���7I�d`��y�!�������7�������$��j���e������{0�\�
�0NN���]m�y7v�/���RBQ8`��-(nweF�0�k@�))�O�2��(h$5�y4�.�'�zV���B�Cl6e�D0�yJ���,n�������t��k�����x���<���@��jI�>Z_9��_��Xm�1�����E�8%y�����q�~����^%T�J����%<�=�[<����1n��W��$�9R�'�"�(O�
5�����5����DpD�:&}�>����44���ON����������kTiR5L0]�����p(x��<����g�>�/
���2���E��K�����T��<�el���-��VM�jLU�r��I�C 3^�g1"3 M���/�����|,����W����z%��a=��fi@2<r�^X=��f�V	���1�m���0���-+Y5�|��Gw���U8| �y�`����rj�9�������_9 :6��l;`��7����������y��3�+�T>�X�Ta$�
��$��O�/����r����3�LA%�������Bb�Yy������;#���qv�H����V���P� �@j����-^*k��r�q�Y�����j$a�1�������]W&�Q���69�W������m�Iw�v�p��G�+�%M5%�Q�7����:�@�����=�7��c����4C�
�6����	�{����#�o��hl|k�fjr�;�r�����Ct���d�cT?>�{�k�R;|Y����	m_�Oh���k���]��������������J����[�����%��:L21��
����j�XQ���I��]�0�1����u��\z3��'��9�5�)�&M�K|�.R�|�a���XU�f��Z@X�B$��t|7��	�����������cU�/7d(�����P.��4�p�(�U��rl#~v�%C�>u�����T��31�RS��|< ~�"���g �]o��=���x1���#�qP�������S������K�N�&�y����l�T��=�T���h/�`o�XN���k�P �E�c�x�� �NP|����*���v��n�S,k�9S&�R�w)��{������DY��,��q��W����M���u+O����!�S�7*���h��F*=N�!�l�6�o�k<9��_�h��^�'r������g��@���b���+l_�[�t��,�{���L��7�0�
N{�N1z��K����+������M�@8��H���3I78
{��>�H�������m�o�9Mn���l�
!�����|���'����i���G������j6n�^���3�o.���l�
#��R�H������e���m���D�p�q0�������5M�2�Ky�Gd�}�t| 40*]�`����[��^g6����6�����J�j�Q2��JNJ������jU��B���.�q�x�Ju;;p�4^��V���6������xj,r�~3�Y��3�>�
�����T��N���|*�B���_��U
�S�n�(=�������W��''�&K^�*d�ax�����\���f�\����(���H+��l�����CGB��|�f~r�"{i.�<QB�B�"O[Z�}���n/�dB����^��`�mR��M���.��Pzk��N�6T�}��t���?O[�%���{��K��	��mS%����l�x1��I����]N����}�>����4���7#�!��JD&����n��U�"��j7W[7>K|6�
�u����
���)=5/g�0xd��&��f���>�U����:��=���{`����q������a�l��-�jw����5:���[��*�,/XPW�b��nX>9l.Y!�
A�������m����{k�.`JG���m��O��i)�x(M�?�����K^�u�B��k��g�3��*����1!�5.�4,�:L��M���e�3�h���8�5/�����'
�<�<��PQ��'��u]���Ao��l��Q�m�|ij9z;+���
��\8��{yp�':I��+�f�#
/hl��+c/1�Y�����8������h`:;����Dw���K9;����\��,�V��	]�U�Z��SbU�����fV���Yu9U�S1����h����ykg+3T�� T}_�g���>c�Q���zX�2��� �~�Y�����5{{�o�S7s���@V�"���}�����x�$u@�.����E�� ��y���t��W�&y�7���V�������&��ew�����I�C&m��UYz_5�uk����$�����~�e�l�&a�mY5�=3�
���-=� �L�P�S��r������h/����������X�R�o��������f�F	�Ro9�����<[;�(�/8Me����z&��w�Jr��C��.�e}n*���s M���Q�P**f�����>]�eGk����t�Hjx��Hg�\!6b��O�����O���~HA�O��y��l�N�������;��������<P�/q@,r�?z��-
*E������y$Cu�����"��]���^��\3�NR&D����nF�w�q`�Nb4!eD�t%M��N�{�!�l�6��x�grw2���z�)O�'�k��`D�I���#�^��P�rr"���q;�;X|e|k�&�4Y�.�b|}���[Y����'���aJ��4����jL{\t��&�/E������
E�lL�q�T�r�l��������������"�����o�P�
�p���d�B�����\��,����r����I�����~����O�8���T�:���S'���R_6
'�!T�M��0��k3M!�Wy:R�����y��������u���s���>��rbBL��e�sp�M�g�e~qv�� .u��5$l�3�e�5<��1�@B�	z��������
�:�_������&(y��~���k^�`?������]���pX^�f�!rAz�L�`���Z']��
��0=�|R�s�$������(�Kr��b�fd���w��6��2�|��:�b�����t^'����h�f��}���
� �bK�8�%�7��M���m0����!��-C*���������'���8����m��\?Z��X��g<�����m�w�����F�i�O�����r���@�f��k���������&�)\���>	f>�\��������9�a���^���uLL���|���r�v�����������M���k�U!����>qz�����ns�,e�C�U�Pe��Mp��< ��k�]}�����'h��%��b8����\��kD�8P�?�������s�*�*�1�����!NH����Zk�����@��^4��GoT8�T����&�m��	8��v.P.JA�);lu�R�'������R{p�y��&�3�A��Z��s)+��i?`8�����cIS�X4�	"�}�S�m�g��0����� ~s�n��C{���V��v�q����D���N��h]���K`�������lW��`{�s5/���FO�Fn���iP3I�\��U7z�c�6�e5��]%����-'�C(��=�#�`����p����fla{�[48L��
���|S*|<4{=���	*X���C~�&[+<iY�e��1��5�1��m����b�b�(p���e�"{��h5's8�5��&��{n�Ie�����6��3\�(}�-��!�$���� ���p�|���K?0$
R��<Axv��>Bt�/S��"��j&-6�A�Fk�t���N�'������t��Z�1I�&���6�o�D��y��d��x��Q�����}�K����3�w�"��N�����ft���b�jA���5h^-gw�/���b��.�t�$�V��vm�:�CW��D+��~=�M�
��!��X1�������4���9rM�x ��)���&�'�\I��Z�m����]�A�&��+���&���v!��k����=z'UW
�
�L��b�;i7��v{���k���or��_b�X7y�����]����u�>�)�u7��~�
�n�&PS���#����l����Tr�P�p��Mkc��_���r�r����_RW���Gkr��W���M�_sj}�gx�>�|OJw�y�X}�)Z}{��{_�)���&y����<���We����f�6�P�����s�3~#�;Py�y����n)fGeZ�x�i���=�����s9���y������0n�Y�(�������:A�Q���{�M��{W=���4�G���c:���y�Z>]p�KH���6[s����������r-��I����7��t�����4��p����`���i@[5�����8^g�H0x�� �����@���9J�%��������,P}�����������K�W�L�Gb�6��f��<�1HZ\�Mk&�}��R��_�r3S"�e����x���4��k�tc]�'b�,���	�����Ir�+��TBP	,��MO�+�c(�J?���bRDlk�6���5Y�aC�A��\�����o���I�xIa��?s�m?�+70|��s��a����=d*�����E�y����g��K�I�d��k����d�W���~�N������s�G�*�������<�Wg;b������v�:���E���P�~q�R*i.^;<�Nh��$��/z����B3��K�dn�j%&�M��a�{Q��s�*�FH%��(������
N5>>��10}�l���77N�|N�����u�w��G���C@]��1������*���6��k�Q��n���fV����G��a�(9=l��_���a�1���U��Cs��>�qZ<7O������o6��%g��E3~�#��+�Q��|:DFR*�pA(T$�a��3c�>u����Y!:��1��=�+"J�gGU`M�7P��u�2!�>�vH����2���0[��q�~������m;�t������L����m�v�.���|r����������f*1����5|DY�����+*��"������9
r�l��D��5Rp�G+�7�f��mk'��@�J�,�l���"����t�B����������r�u��\ew��T�&�]z��OM[�a��� 1;~�s�pl�N&!���#n�DO)���O�R�N�J-�Y�����D�M��\���A����RS��O�K
��9�{
�!+q��pW���C����>�f��|�g5O�����>f	kr}�>��0�Z��)*��]:�Dvc�'^� ��l(��4%u��s��M������5.���H!�K�@�ch�. ��Z3�nDoZ4%����1	�fV���Y�0w{�}�{wZ�t7=�:b��K���Xg�aZQx��Wvs�&����5d/�cPw���c���0q��l�����Tt������)��b�G���#��D����
��$>v�Z �!���j�&�N ������5��V�����iL����3-��k�/���{��!ux���[�)�CnD�/���{r����y
!wZ]U<����Y���Z��`�3)����J7�EJ��FG���
��{<�q"9��54'���ck�>�s�����H��Dy�Z����n�}[.�����
H�������������1X��:���S�-����G����K�lt$���`q}!�GS�^��U�mu�"�� ���<k��}kH���7������es4��;�/����Nov���FV}�_�N��!���>���`b"���k����<b#=�L�[�2M61[����?���6p��N������]'w�&��h�0m�D����[}s��Z����������fC���9Q"�	�
�������-�A���Z:�C�M��K�4���v�/��F��}r��L�U���<ZH�����6������L�'��"����EK�A���t��qA��[~��K��I����������!`f�B��q�Er�v`��N��E%\��m�1�u�����g8y�jp�wf"m
�7*�_
T7V��il��NjN�����?Smv�&5|��i�[�g�����}��������d�E���|����l
o�&5W
�2�<~�]�����_��9	����M��>��2�s��w�!�R�-b�m���)�GK|<�9)�$w�9����p�k�����
�y�y�@������I��P�CU"@����I��?0C+p�j�T���>G
�v�bq���D($2u�7��3C�bo��������:s���EA���� ��~=|Y��i���i5"Mu)IW�n.����i�@�Q*'w��i����n=5yI���.}�������7��m��A�>��<h
�S�9M������n����\��Z�����b�\�?���$1����k��o��a��-��z����)-6c�N,Bt����{��\^��~���Rs����a���=<�l��|�9z����g��|�
�w5�B�'�`��D*���9�6wK�N�R��?!
L m��v_�r<LPq,%����m�f{rV#K�N�_��v��lOT�T
#QiQ����Z_�"�g�s�f�bL���^���������ZI&-Ky���B�O����&�������Q���|t|��0;F���_LxhNE���-���'x�t�5wP��WC��7^�9��pZ(�W_�G���c�F'����
�����}�i�����M?���7-o'������b9��3UqUV�75�v}�6f��(�M;
�L9�����	�8����0H$������q4������!�%/T�o8��5�|�����U�^�<��hJ�����ei��&�����jx�4�P"@y�[�lM�{������8HH�-��jN}A�/�����l�v2���
�
�����9�7W���W=���_�z��%��x��E�%��A�#�q�a[p�n�J/���H���R~�Z^�}����~��+,�EP�����=����y��thaA�\��^��������,���x	�mo�y����K�>�}���s���}��[Hiv��y����H���!���h�"l�We��
sm&Pwy�M�+��w���v0���;@���K�4������|�)��2��Pr��������Zo��W�;���e3��F��C�����Hy&��:'���=&KX�J1?������v�X�o�_�g�M�E�E�� $����)�.2(�����:�������3�P5"�����;��y_�A�.��O����� e�m{��v�h<�s��>V^��t����S��y-!�D	%�y�>%�����X��V�g�C ���|�#
��O����ger��mI|�6�M�CFKgj�b��<����!�;&k�����&���s���:�hK�'`���J#
����w���$�V�h���>Ct�����<��-�l�&�%�DE�$�����GG-I�@:)X}��'�o��c^�w�2���f
�_��(��pK���`$
��7^x|1���mF���.JD�lo�9>A>���I/��
�Q?�Xtw��X'^��A@%]�8XvS�o{^��R]0Q&�\x� U�:���;BhK;�v��\x�A���B��#s���'�-��WMG�6��z����Z�����W�$~V9$���4B�,1�J�C�5�������H#rV=q`��`����@��"��})(."KN7f�:p����9���A�a>]���XQ���a|8�H���\bj��|������U��Rs�a���"�w���N;A�*	��������\����B<,N��v������mc0s�|!����	�����(T�^�>����
�(8rxa��F�v3G����{J|0��]�~�XJ�Xr��l�B�Z)"D���|���'S�.q)M>�R��'[�Qg�-��z�l��&K>���{?KI*�M�������������RbP�gC��F�1�c�i���3q���f���I2,���	$���T�2����V:�TaLam<�@�@~Q������[.�����I���k�t����6����������m�����l�����;���BT�M�~��������zG��p���o3_��0&���Z��`&�������t�4�3�H�g���<������QMD��!�r��m����:���.�0x�������_s��|�>� 14��l���
�?���[J�����V2���/,���u���pJ�1[�]�����s����:����V���b�U�1�g��m/�Wn�W���X,�sf����3�\���6�l�j���fn�����1z���{Xd�I�����tofz��-u������{���F�4> ���>�#�BiE�@���7U�>�s�Y�=O��69{������z%&}.o��Yp������U��J�����5�����n6E.o�NG�I��*��w�O��GT)�����,Ru�v�n�#R��(�.
����p�o�_\$^D���a���h�5�nN�R�1��W�+C�4u�u{�#��9�e��3�j^�H���m*rL!���pIA�D�a�&j�;�,�aa5�x�y�����~��8��!v���h8�-�<��}1�m�5>�`��T:��B~�~���a��1[�#Y?r6��gx�[��,����v��V���w_������s6l�>Ng^0^���mR��P���@E��U�%X�V��7|�1�(���Rf�(GG�!�'�W��	�	v��+,<&��r�����{~�a�UR	��\��w4ez&�Z3���;�d�<�*���?E|.���Z"�8*�lgm>����K������}���[�do�GKkB�^p�?_�fW���7�sk�]�:���.w/�����
�&�<E���!�,��kG���*�����L��GT���m���S�>����������_���������m���WC;�-	i=�8������s�b��T���4*���7�_�%'4�����[��F>��K����>�yHN�b&��#���,��a�^�u"������89R�G�gj*9
4e��{�O�4�x�,��*�q��q�C7��L#LRR�e���]��O/�O�^H��+=2�v�v��~�����c����N���������������t��' ��G��;�A���Z������.��5���i�{RJ��s�_���fG{��7u2tV�
����j8�����������aZ�]���>��o��ce�y������H��7'��ki�t4��%�����������%���zQ3�}�Pi�=�g9���a*��[���~b�d��jC��3
{����"q7�)��d7CR1�������7�y���+��}PZ�������i�x��K������������r�}��y��I=iQ�l5���g��q����O�l�i(�Eo�m���y��o�����E1T�����p���T��`[!���#���]�������eS�A��a��W�oE�U���	z){dXo�����,��\5����^{Pe/4E�5[fe�����-�/��@���1c�CIcxK#��/2��Y��[S�����M��*��p������O����9!UO��t��Z�D
Z�!P�������-M��1���,�.�4w]���6�>����/��qo��U��7|��Pt��FvdVd�)��~g���5V�q���������X���V�[��$\�������sf�pU�k3�]rr~c �{h�j����D9��y8����C���.�!|��2F9��5�V!�Y�Q�g��j�k+�!� �dZ ���$��I��D|.�{2a������>5����u������nF=����L������H>�_{_,���3�Ko������X.F����
9��.���n�X[E��7(�z�F��|�c��T�S�"�-�����/�����.�-��Oh�nQ����d�����/���3��R�u��&�3#E���4�������#��IB��TY;�-�q�n��1����X}z��{l/�[!|l��M�~T�(#FNT���b�65��\�?�M�!�E"I2�U��S2H������Vuq�jTX������B�M�qw�_��~��HI�4������,��uj�0A����#?4E���X������2G�r$��C�����{�.���L|E������*���6�N���J�AT�.��E?mt�<���*��]%�x��$������<�`�� ��v��p'|���\y� ��S�����_�a6��#-��&]��&�u}����>u{�_��/�rV���	2�\�y�GOY6�����uMiIz�6�;�0aRHlBW��P3�#���v��4�U��b��n�L���u3����'oTI�*<��Z.�[��rT���ZM@�Q�L����U@�jip��r>��0��������J%i�_:h��A2e�[_8��[�C���aNC��w/,�.�O�a�gR�a�&���i{��n-L�~na8�
�H�3���~]��%R�&�7�����A��H�j�6}m��x �w�h�GV�����c �Z�7F���xeW���F[JZ
�:4������)����p�/�y���^��3 $"{�ip�C��R&kl'�g�12��M�������N���@�!����CjP_���&����Cm�3u��}����1�1i|�(��0	�^�������7��!u.39���������'�[���g�P]��
�����Ga�eL�Y��A��r�x��(?�r�c|&{�J�������&�s�&p�&[�a
�3��&N��*��|`wboU������)5�&wD�F���6�2�o4�S�Hg�����"������B�{Rnta�Z�Z��4}�����qO��;A�@��Fb8��M�E��P�s�e����T��9��y'�V�9L�10��H�*j��\&��T�7��B���C��~T�7��'��b���D�!��m{L
�������N R�7��r�i���|�p���\g����q�~��].y���5��,��9��U����b�oP���C�������2�jL�8\�e�>���T�QY�������f�q�2�r/��:��4>E.um�TA�{�����
�ev�k��N��>���gcj����x�;PG�����������GP����G)k7{�xu�����A�L�%$~���l�������H�Ik����������PU�z�
�����Di�nS^'�K���|\6�g�;hC$����4�����/eu���-���Q2����b�
U��*�����H�Jr��t�N�F���*����w���7W������a�E�ID�]aC�Ig�r��#v3��#F`�i�
�j�n.�;�l��xL��W~G{P/%TEB�QyEG����I�h�07P��:�����uw������q�T�����g ��@���*F����g���Bu�&� >9[��M���I�':Uj�0p|
2������KXI�M�J����������'rt���9?��e�.�d��c���
W��r?m�PeO�aV��N#����@~�%�\v������q5!�MB�6s~��r����D�'zI��&���C
�q��\��vr�v��.��Yb;��Zp�������)��J'u���FY��)�I�X�._TL�[)LK�#���x���Me���H���`uh��K�J�6�����!ESKO��i����,�[���{���R)�[U��,��~����V�[��}46���_�^��g2����s�JSm��K=�_Z��<�n�=y����`��,q�1~���AT]3��v~���]��P
�V!��
��f���9�r�Sik����V%��o���k�3�j�W�#�\��}�C�z��+�t�@,r���jIF��>r�&����M~�������3MD+�}G�Dm������#�'m���,*�
��?9A�|#���P�5���f�W�t�A%����5)oJw��Ti���}j����j���3T�)�"BK:s����
��o��qj�4;6���P��M��6��d&^�Y4�����Q�m�#�;hkj�Y��js�����L"����C
OO�!�bS��7���9iR�vg���|}q�\�f� SM6<qA�D:u-���My�hO��0q-0#����z,/�����%�����MX�m0�6#��������z�D���/Fn��/�1#Lk����������E��s4-�/�����n�b|��I�F?@�D:���#��������'\Wv$��4��oR����u�1�t<�4��|8)&���h��:9fb.���:��%bd\tp��=����Z4��H���U}��>�9�)����o8V�h�s�����|q�J���l���$�M�;D�9��������(��<l�K_Mh���R�����79b<^��)Z���W���1>�=CyJ.Fh���1f��6����WB.��E+��$�)��k�s�W��6oh����x�:���CRc��f���J�OZ���##�?$��2U�h����-d���|A����cw�\����|�D���<�>��M��e�0�dw�[G9����Tg���(���Q�����>9�����
:�&��#7j���v~���l4r(�\�5n��?����tX����N������A}�!5�:�?��N�0���<�g���6��\�����/���ei\��_�,�Y�����i�w�[��������B��<�q�������V���_1C��5��X1f��M9��?���J���|��R5:VNDr���Q�}R�1����)O��D�x�z1l
4f��{��l�o�u=:�n��������~c�t5>
O.��0�{��(*<�/�U�E\��MJ�����/���i����:�5*P�n����lX��^�>�3YM��x(���x�$s9
��6�s��A���������
�P����LJo�����j��R�lZk����QPO�c)P�$k57������ok���"���jf�����\�q�c������[�������5�@�8]"@����}�=�4�}D$����`�T�m����s$O��#wY�����N���Fv�\��P��U�� �����l��N�N��O�2C������-��[��.!e9I�&��],��~�8s�p���OA�q$\I��>����8��,F�F����"0���`��D��3NL?>B����l:�����mI�a�l�8�����E9��P���&CY�r.Q����K��B��x��Q���%v�J����V����Ki�#R�Q)E�y(z��Z���K��M����9�
U\��\
.%��aG���+�95�NfA�>�8fmfL�w^Lr��q�{?z3�	�.���5@�����j����Y�R�d�����4��s�k�|�����y�HY7#��-�����i6Vf���!�(=�H��v�Q�@��J�(A	z=�.|�����fP��L����T~�\�a��a����c� &���Ks�lk�o�d;�0yz�������yfl�g����O.������N~x1�x]��@��o��"MK+k����"_2�Y��Yt%�6��65�p�������l{/�5$'���[j�R�<�%mN^3��x���0>U-���W,�(��Q6CZBdC�������m�)u�e���[x��l�������D��2_CQw��n�=�A�xn�F<��0u�$�g�;�J2=��+E���F^���?�2yKd�I�!g��S��CNA*�"�q��/��P��^�B���9�u5�&4P}��./�A��oD
d
o
~6��x�@��qU�����UBG���n���j�3S/=*j fQ�9lBw��$������	�O�O��s����c���c����u���Kj7<8b<�5\<y��� ���5#5��a`-
�������POb�7O���j[��v��x�jJC�11�c�����>*W���B���j�Z��1�m�k��i(�v�5�����p�2k�'����.a����v8�d�l����B��#���8@.�d��@��(��u,���~��\�C����Cj��h���J��'*�vkH\Oh�H
��[�Eq	�z�~���[=S����:��w�S�C�'x�v���_���{k1��>3���x����+�eQ��<����'!/r�W��~%�_����J?m-��eQ��p�����Y��5�C�,Rfo��}��Kr�����b�#B��^/bi��,�����Z�����{��������7���M��A��~��r��`������my�fH��8�dG
��n�>!������T�Q9��5�l2��bU�Bh�U��;d�����hl��I�����������R���Bo5����+)���3�����c�)o�*=�;G�"6�S3L3��I���<B��~rl�����F;��	q%?�"�e��<��Z-�0s���O����I��Z��y��Rg�E��(f��>G�
����g��\��?=j�F?�<=��R��n�7x#b��Q�}8�4G�������f.4�0��n|��~T��9MHl�� 7��N���l��M�TGq&�F���oQz��{J�H��pP�CT���<�wwy�g}�p�w�H&
���Rv������H>0n���T�z�?��H2�n:��..1N���~O����T��
\{�Uj�����OHr�����|�t����nc���]A
������]�E�kbIh�J�3����H*�iL�$�Fcb	�t��~G
����:u���yK�d���@��z�@(�-�C���Qs����9i}���tO��3&�Ig
�C�]y�{�\�Gk!]�=�S�j����x�zv��b�R`[#���^��4���)�z����]�
�6�^J��9��mzp�R!�`�_!c��VfL�fF	�(�he�Km�@��T���a�U���x�(?Q��}-'�o��1u������M��?��h3<��e{r$��;T�����9X�@�`q�!�kmW�ui���a��]4�l��N��F�mb��_��1�^��0�Y_C�n����b�{
���93�i(�X�h"�;i�!�F����X����Tu!�������O�9������>�1��������
�{�b����?��L��`�NP��G�~W���W��a�N@M��������s�[������	w���zDE1F�ER���D-�K����Q��i��F���=���b7�l���`�/���a����_9U3�?xBF��`z�
/5������pP�laO3�G�V.cdr�����,�n��O~YQ��c����#P]Ch=8���6�����K��G��Z��72���22T8<�����W1�n�<M;��������&����$�Y�6�RM�O>-N=������	C��Ym!��KMU*�1l�T5]�L:��"�ejSe	�O��:?w�y.��0���a���%��z����,�8z��\�CU���4�}�|�8A����y��d%��3S/
�C�0���*_~`�9XyGz�4�
��sn��q"��f}���Qo���%M��yu������9��GB�x{�D�T�o7� hvLz5�d��4������!����s���E��� �-!�&MSb�kZ�{��u�rh�L����x������Q���f��������n�J�:ve��HrSn�����`���6�>���j��Uy��QE�QZ������T��>��Z�Qo�6+�j����N�B�}{�F�9��Q�6��0���g{,���/��.�i�~����wg+�T/
U��V#�
��q=r����A�k���d����4[�����.1�|oYvF���������`���
����D�)�q����t��W�U�a�;���K�d ]�]�q9apRyu��0t����;���^i*�����k}��d�J_��)hB��#n�g8���G3j�����o��I��a���u����o^�[c��@n!m>s1s4�����������db��R��~��)�wY����&a���B�L�<�>E�.�mHKYQ%R#��nx�QV�����A�����`���rG0fQ}B��WDo
0���\�W�����: �	!i�����+_S[��~�=.��@��w�W�S�0K�=�\eL�Byb��`l���9��!yi���*{�-Vn/���S7�k�l�A?��O�,�����	hpB�@��E���h0J�S���.��4�^�������F���e�+lw�e^y����|
0����k���0���O����Lq�E�,��g;@^3��^}t��97-1�?�H&+�z�`�p��N�B�uy9���)�2];����~�R���6l���������,�6�e��T>���7�\S��:���V>l�����}C�e�81�R�u����?���z����q!�h�w��y�K4�
��L�kG��7;�7;B!���]���B�Z&���{�n�����,�������_g-�Z��D�`�;D( {}�/���8����)��*
_���K��]����%����w�%8p��~y�:k������������7�l��C=g�1z_��<��s��z�2����.����A�+=�y'��z���B���=�>�PiU�CN��;Eb�5���!�;��|���k9'8X%W�N�I4cj�Mi�(1��o,w������8l�{��l7@����4�5G�:,��1#�����R}��\��0}O#��9� R�j�Qp�D�gE������@����v,���3�b�%�yU{9�.�G�|1�c������a����}!��^>#
#�����^%/�(��Mq��
��4$2Ea3��{P�'���7e�!D$}��[��T~���,*���X<���1����w��������E�
�����������Z~lW�5X�j��M s>L(p>��?t��:�[�"����3�����
w�;��<�q�6���A[`\��K0��Y�
}
��V�t�t��;��>O�LT
-��i�T �����|��4�r�s��m�i�����^g�a�w�b�]^�yd�F��=<�����1;��<�Q*�2������J9�[��]�!����Z��������1-A���,`�����������P���/�t����LE2�>{C�C�J��#����L���K�c����e�e�7��������O��'KEQ�$rM�}�[����XM:�0���L���n�4�u�������
*v6���|����������f��)���0�%�x����x�
�6���
��0��3sy1jL���K�~w�'L��J�L�*�4=����F��Nw�QqN���w9�5+�=c8=C�q�&s�o������/�\��������6�r��R��W�r2�Ss�������\�}v��Q�5�W?8��o_����>�;B������9!������3���8�,��D&bg(��kG�P�t�Wh���[K���a���i�r���YM�
�#Vd��&��B�����X
|#@��y�i��O*� ��)n
U�<H3;�Op�Q�yR�����OQ-��t�w��k�h�:.�A�l���I�Y�_w��LXc)"��0�mFsC���|�)]���x+�M�J++l7T��:8a��Z���p����T��	���J+�l&3I���r�H"8t��<�[�
�R�������r�)��_�9��$���l�{`��I���\802��=���������e6[4a6��4o5*�9?�n?���������J�yo�7�P��(����������������~��
MQ���}4�=���D��I���������������a�C�.e���;v�Q���;��M��k���&i�
z����H���.�Rs�Pl����`���rIA�������d����.��P!��ez�v�I@�x����eT���,,�2�Jju�b��!�w�����4�#u2�%{�����S�>���)�����`be�	r�^�(3r�J2U�����s�R%�W����2
�rM����p2��R]���5�z����]�S�iaAjv�^����u�t�wkl9�p0A��{�T�T
����>������}�����xhE������'
P�H�gwZ�Y�	�3��e������@]>�_@�R����<C���6�������n�1�m����N�\��K"���W�)�l��E���p���M�����������	�QM����7���Y-���Ji�1�{[u��D	��w��������+���a��y��X��
a���cWn����.!;��
OP���Zz0&~�5����a.��7��gd2�GW.�:LHu>�M��;s1�~km�tH�5:�8��������D���x���<�hs��?���������3�#�����L��c�R�it�G-~5}�c�u��F
kp�`�#��z����Z�nJ��ifU"����r���1f�zt)�*?:W�Uq��]J��X���lm.��H���o\}���mov@����@u��e��l���aVS%�oE
tm�]� ������{NI�j���=�y���S����H�wT������a.�4����	:kF���a
�cI�}/F�L�
=��������I����,#U���Q��d�hv�D�)3�;G�J�������X �e�1?P�;�76T��n�6�+������];]���J| ��7�>�OZ	���_KV��tnS������F ���
o]B�n��91����.�����]yW��L��'kZU�+���KD{1k)��@:�:d
R�}B�u�.{%���A���3�<�w�V���S����md"GM��4�9$kcL��]#�B|U�s�y�|�<7`^�N����3��po���Z.�6<T��Uy���h9���)��]��2��{3���������*�l���7c E4��!��/���G�k�]�X�3�Y�7������7�����gF��#g���Rt���H4�w%!7��eB�������lc��c���������bg��|�d����2�9�"�A{��Nk)LN�o� �����1�:|4sXUG4�CDLQ�M��4���&�]�������?�E�=G��W��n3U(�zi2@w�l�5Ti����h}�1{x�c.��X���K�fX���(�#w�!�]�8�W�����M/�s6ji����2��Br���h�]�G����}���J����Bz����C���@Yd��j[��O
�����c�o�(
���DEZ�dd�CQ��.v�����5E�C��>�
4��dE1RVfE����6�Td��`���2R���Aw�d��K�����,���K�q�O�����:SF���p�6���Z�6l�9��:��j�ro��4A��7���9���_�(]F=��y����{�#�nV*K���z��H�"�z�P��!p�����b1
o"���"_�:y���j@���Y�_V4�����Nq���l�4�T'
+�����S�&���Xr�p��t�w����t=��j/:a+
>+����K	'�q�THj6�Z*D�>i�6�����������(m���&_8k'�z�iaL��1��b�c�eK��>���@��`iul�%Y������N��j�_4=�@��	�a�������J���p��?�KJ+�/�����=���jm_<60��������x���'b������N~�<N�`�AY�;G�a�L^�e<�����T+"=�����7m6K����������;�����R��+��I�����<���6����)�.8x�x=���B&@����$��^�������4�6$WN�����a��y�r����6 �Y����� ��.�� �K�l�g��=����	�v��sJ_e���t
�<�j	�������*�~9,�i�J!��������(�z�#p3��RJ���r�$�'6s���]�S�\3�|��~r�C5,z�@	<��2a'���� v�h�;�R75�fzb��^��9���L���P?�XnH��C�=�y�ht�Ap���{��������7��u�A
���6�w<:��l2��T��8��nT����#m�nb�e�P����o�Q)�-~+�}�z����!�vt�8�p�����}�|�n��j5��7)~W�o�k&�����Ti��Y0�:hg��,��!�Zn��2
������E���8_������>M�ka-�uLx}��v$���f��up$��z��x�|!U(YPj���&�7�Gt�"��9g����%�X����|O0�RcW�en;P��������A�H�A~9`I.��c)=94�}3�M�����Fx�X���)h���N]G�H��?����/��H���@�A/7�t���A�;+����3���������4��1��ts&'!&������;��)D.��!`S&)g����R��g��l�>�Rz�e�0t�����K,�q���
�H��<��
�W�o�N(Ui������=�Y��KU��K5l$W[�G
�B�{#��w���P���!C&#G�Y�o�i�����O��|3u9�%�n����/�vb���U��,���{�������
�������T����W1��1�F!Rl������9���i;����A~�9nE[�'�����p����M�����T�PP`J��K����:��d:5N��1!�Y��wVR�L~
�����H�R���m�^�{]Z�-'k���,g@n����m���5wsru���+BF�w��g�5���>x��m*���`�c��7)���.
o6j��1qH�+��&M
������G���L$�<����)��q��M�r5�l[qL�L�B{`v��>�p�V�n��_�3����|+:���G���\���B/�9��B����o���T��t�����(	�u�p���'��w7��'��]������3�
?d���E��%1J���JgL]����y��q�L�!F�t������z�����v5���o�
��K^0nB:j�v�������W���^_"j����?��+�nS��q����l��t�X%!�[}�|�I0�No}��=kz#��>�h	�J���R�����@�����s�����j�7��A�x�	���>���Qf��@j����K�K)����3��������Kyp�}���&`P�����?h������5+����{J���>�e66�����<�<>U�k���m�{b��	'��7�]��HV�*{�L���EH���U�qmV~=��B'm�-���
���X�4����(5Qo����<��������R��{%���V,�����4����hM�\���C�2$�N "	l����:�e�5[�`�fz�%��������W�Mc��j�za��������e��6j����LY	��\���o�M�FR�?������uE��.��Yj����L��c�l�J�h���f;
?���d���EJ�J��:2����:���&�3�+��&������OO����������6�{^�-�$�}b����I�1d�w��3Y��0���'�� �Ay�xw�3�tH��|�H6
G���Kom^1��5rN�	o���Yt�	R1������r�3y�[^���5O��d6Ztr�����pf�qr���Au#��;���$l�8�n bluk(`drhd���i�1���,C�|���@�kQ���
��!�2Z�7�����6�'[��*.�
��=�fMp�qJ�M������K�y�K��t�yC������7�f,��|S��4�RaCxt[��������'�T
��k���I!Y�X�^���=+�R._^�)I��.Em�!U;*�����n��CY��cT����|��������{��v�y*_�����Cc- ,#�}5�U���|��#�VyCV����j��j����{���p���w�L5eW���������z�����I�7t-�=�%u�t8���f�S�H�o*��R<��z�8�	/?�~���jM���a�h��8�L>����q�m�%�����o��\Z1�DB�4`b������4���������_��/F���*��n���d�(�!�:��������{*�*��+iu#K����7�4������O;T����*�u�!�+��?�����c����4|����vE�Y���l'�I7��].��]��,���F��D���F���mO����Hg:
.@� ���i���E��������|c����Q:���>�*W�!>�?$#�&������{��hV�H��D9�?d���#�p[���s��t�gq�[^VW��9�o�g;�l���� IIc���4s����R�&��������x4����&!(#�	���ICD"��:F5�]���z�%h�=yX
7�rI��x��l]�3�����[���iS�����N.c�w���X���O=7&��AM-~��������z������\���<:����4��6Z��r8^1(�Q��P$Q����u��7�^���qdK}�j�Id%����c\yf�]V���?b(��U�w3���|�x3)O��^@b40|z��{��������z�
)����������|�syJ���8��N~��n�1n�	�JS�V���EUw9~�
x�{�������(�v���5I� �.9��Q=�����t�kL�.�t��t���R7���A�/4�_���xrK�r��c�d:���tB�H��^��i���L�8c��R����z�7���w�,~��`�-����� ��Ai�P9>yM��?9���'9<����n�i�;^� .��OS��3�+���93x0��Xlh�r��ey����R�#\SV�Yu�YU3�V����:���4Qa1M4���Z���'._�`\��?���G4�,��V-2���"?�1H��=��Kg��o��q�A8/�Br�g�C��o���q���.u��a�l�FCr�g��(i�����!eV���>@��gV�$�W���kL���}o1��\��dW�&`x������A���d��2|#:"���Zx�����n�1�9��'�������we�=PJ�iF���Lh!A��=
��,\������J�y39i��[���S���c\���r�S)�����;<^�*�Nx��V�R�����g��EG���M4-M0������4p0�4�MY�Ii� ��k��X���mWAQ�|MFNR�z�\G_>�@�H/RR�I����}t;�����`K
r���;��^k�Xy�n��LC`$�?�T�����:��b�P� �$�����0�mMQ;K���Z�{.��F��0�����.��}BxOV��i�1�Yo�T�Sb�Es�.������P��S#B���)�Q��N�������N'{�yfW�~���H��'����A�-~5��vk�^���qA�r��gLjMc�Bk����������]���zR�J����t��>�T���~��2���mYq��R��!��H��L�)�n����3knz�_�o������"����X�lc����E��"[}��K����]��)�jF�����`EC��@�X�f��\LA��	%����4n������i��@t�s�����l���>���K�^"x��Wc6��=A���?�|�����%d����M��T	K�����X���D���E9�2Fzx������,���a��3{3u��G~�<}C��_��
�7�w8<�-�U���B-����=�Rj.�8�8�o���r�:^=B�=g����DUa?���^���m�v�)�q� ����H�
�C�9Y�'�kK�=!<������L��8����1�� ���h=��`���������T��|����L��P#�^�"��(zHq;}<)(M�?��VP��������I0\������w���R��N�T*X�]��-��:�rD������{J���QV��.���B��x
���?`\��Cb3q*v��{�p��X���T E�%��xv�Lp�(�[��(�*=��N~/�-\���Q.�{ YT�2����:`�l@m��1vT���t�\�N$ nW.������n.t!���j~<��j;3���c7 ��r:*��QV.V�]*��q��M�>_Nvy��$�G���8��|�tb������
��9����0{�r���U�s�����
6uuv���&�nfS��z����O��r*����*����j%��
v��zKV�l7 a���.��W������`k�=x�Y~yC=���~u�%3�D1b(��fB�Q{��<�����0�`��[2�����z���O�7)�������a�f�z��j;Hm9/@�����i:
v���Uz=a�����"����syLl
?��MT�4���<�[CJ�o��*~���"-3e�*�c0��v�;;!�������V�&��:��j����Il��	��!K�j{����@y3h����#��#��I1eAE��}���K�*��5���y,��zn�x��|��
[t����W�����9�������.�D6S~�x��L	��$�"��������C;���0����+��>EV��i
�= ����W����9?��px�YS����{'�Nf���T�"Lg�{y�k�9�j,\�D����<P��
�=3��{������� ��DY��N��t����x;eSn�t��A������N�=���S��T��S���Q.����mV51��������n���5`�e�5�2��A\��?����i�/UL��Rc��������=���6���U�~M����)�rh���2J�(�^ {�'r=��S//� :�OEc6��D�U��QS~�7?]D��vs�������U������#����]#����&�2���U�rs�jT�1������z^c��������N����TM�?��c�x� �����K��@I#5������tz��SUHLh]K�/u�~=<`utjd�F2m�Is�p��"�p��fO�u�}�����0��������e<�a<��V��=�6���?�A1�c9LFj]9L�]E��
��$.�qG�LRv�W����&\ez8@��I�T�I=�?��J��r[�S
�jU��-v���������6��s�z{���w��H��9�&�c���I���O�[��D�d/8��Y�Q-S �3nv*m.������g��,�^H'p���1dE�������zr���Y��&m��z6�#�JZDr�z�hAT
����K�jn_=��Z���F�0|��%���H����4���i����B���E������6���dG5=�x�!�.
-��;sk�v�t�g����AgT���{p@�V���8q�
����vC���Vo�;�;�G��'�1��cx��#�3oA�����SSb:)�VT��O����h2����0��o	��(�8^�}rj�gE5�����x�R��V�x;U���� /����e�a#���I����>��B=�'�Sh������q������M��a�k��Sm&/��yly�a�I#���h>�����gS�`��S�{L��"���1]��5��<L�X��aO�����h|���ar�p���T�e���?n6?\�ar���
�V��r8�>�����\TawN��E�}���4��!�}��-�/B;�L�f�l&T�3���#����9]�[~_���h�����z`n��P�'�tU7a�,�{l3F:0����"�	�����,���������8��>I�`�����������%��~24�DcQ��&����.��N>�~��z@"�x���YJ?e�N����We&*r]<n�?<���!���J����R�6����y�R�n��b�E�Z����Q[C%.!�����*�fx_���h ����0Y�H��p�<��d������]S�0����m�,��v�T��f�n��:j�
���H��^�{�_����Ko�u���6��XV�1\�C��������z�K��b\��7�2�`~�G���8��-��x�V�:��*��D�cN�	����9��G��k�j*&���7���BnP����T��)[���]��kf3�����C��,���&A�dqz����fH�,.QE#��[/�����1T��S��9������s�5����5��F����(M@�N�����7�N�f�y4of����Dx_�y����94���Rn/cq�'�ty�����_>r��q�����Ha���s;�Z;�H�P�3�Z��W����%�K�8��*U�e�����P�`�U�����@���s��7?�]M����YVK"�� +r����kJL��	"�45�x���F�-2�������Q]���r=i5jg^�"l�wx�\k97B��s,���.a��	�p�����_
��<�h��������k������.r�p�N���}K�	�b���z���Q)��T�I��fU�CB���	?��T�����}����D����Z9���E���+B}���?�F�����b������n9�:��>���X��~�x��T���7j�h��k�����D�p+3�M�4�e�5NNZZQ'�������E����c�������S�up �x��^��Og�SZU	��I�O�g�X�)c�m�#�����Z�a��t:�]H����S�E)����i
�K�.e�~f�L������(c5+��w��M��:�G#�Ub!]tU�(�&~{�j10�Chf�`��wl|�S�������vD@����F7���U+��M<_���p����*�D�Z,�h����OD�M���{-3�1�,�iSb����2W�W��u-=��-�Rs;��j��<�V�^%������ x��c6}�����a,��������V�����y������(��N
��3�=r�2�w�����#�	����s�%]�����W�;��=Jd�KZ.�L9�ci�p�pa!�������cW����/\��Z=�c(C��F���mo4
��m������t���UI�5��C�,%��b���'�i�������64�4�'k��������J�V}N���v+�=f�\.|�����xO6�=2���;�<�{%vFV���������): J�������]#vc]P�AM�q%��1Jm�n�����CF�������8�����4�W0�nxMx��������S�DN��T�3����Y��s�JG]\Js%_4:�%�/����JL�1�d�y����HC�+����yYV��&�\����Q�&���#�&�A��Y�~K)����$
]LA�td�K5�Y��R
:k��zD�8A��q�?�AqXW(��h����������/&�yx��]l�D��5H*|EF���>����5N�`^���
�)��O/7u%F�3�+����O�
���v2�p��j����UOT���e�rg�����*�:�M��%����L�e39� �C]���C��-�k����Cjq��d��YD�G5�T�d��=������|
/�}��������R��F,��_� z���r������0��^����O�wJ���r��sH��`���d7_Zk�����Pr:A�m�H��k]x����[@.�7�����<#���C�D���=�JW������sX�u���6~~�]���o�a�eL����]N(H�$9��p?���0J��X��N��Z�y�-(UYQ����f���u)9�m����Hj��4T�Q#�K�-o�|&�Q+g&C+����W}���0�-�.�����\yw2����G�?G���"T������������\&1��Asqii]��JC��0�X&�)��#��2�!�?R�p.������t�
����\:���{:6~�g4u�)����h�W���x7�7�r����RGBXT�\�

u�J{��$>i�����(�u����M��f���*D�7y+I=O�5,f���s�e�H>:�J���������Tc���^4h�5xh��6W�!��r������Ag��M}��DH�
���4:r����L�u��� ��J>��r,�Kq��~i�vu���q%��q4�U��>{-M<���o~���������]�A+8�E`�R~�Ok+D�K���&9��.�����rs��^���4����a��f��3�N,����1��#�&m\���d)�����!���e�u�u�b6�D��������p}n�a��fp�z�OU����.Tk��l1�ZY<��}'��?T��{���������|���J����:%-%Ruyl<u.FXHw�DP�mR��������~">R������������.��:W����������o��7#�b�<H��M��]f+���O�����������+
O[����-'w���
��������9aJ�A���$'wqK��x�l��h��U�vK�8�g_��wR)��:v�����f9��������4'=~�H�e�����s�������;�^�����H$s�SE�\�8=6J����/j�0=�����d�B���i������#��t��#���2���j>z��-PE��y%�C9��Y�������TyU1�~E��Y,}�P��������+3N������?��z����G>�XT&EU�����j�0������K?�:�r	���,G������(��!g`'����]6s���D2y�@9�h(G��E���t��m��m��-f����$+�Y?�|=��h�#����&K(��;��L��4�i�b��a����oe��Ko�Oj��������Z<���e�V�PP������
�����5uRY_@�����#�)[Fj�k���������~���"�����]w'���GE[�����CsN;�-�`�F'7j���Ln��A���>�h�m�+�b�����uB�p�������������so&��x���M����s��q���a^���&	����t��Ig?�����H~�2byI7P91,>�&�!�����-����>w%����&�C��J{_�O�����y��%�j:{����*!j��!��X�����D����:J#?��lUi}��%ws}�t��5=p�yo~�C�Q<�wm�]����R���~�uO��w���CY������C/�^�"���4+1
p
y��L��k�!>/��w�x9�}�����*Ko�U��f��\V��r�@p�O��/�x��I����1j]���W}���j0W9�s��y���K����_�~��l�H�!��������&�i �<��r�������u9r��O|8���54�*������8��L`J/�@��1|��-���SN)�Y�l-���~	���uP��j�?H'�f�"J��|���;{S�"SDI
�wV�?u�������D�$,����@�;�{q��,�j�����wgU7tGu�
�N����u�l������p��%����b����b5A����l�*R�J��T�5W���<Y����LT��������*S9���{*�;KLN�f�W�8�-��i�����z�����)pIc�Mm���>{,R��6�q1#���^�����b3�v����xXb���nZ�fni�K�<�:�����E��46@�)�G���x@��j?���6v��~��)���}\��;�Nstz�T�����d
����8�w�m�{\E����U^[��$$�qP�\W�}�Wc��~M�h�0^�t�q�����n����p$u�1������� ���_j�*u�T�U]���A����08�$��A�k�u�����&/K��9��P���~<��hv�/��T�"&d��4���������,����=]�F�?���M�g&2��uT���s%tP���|I�MT�}~�z��J��'}���~O`�E~��q��3�"����}����<����3Z���|y�\y(����K�I��v�$*���1��'��z�9Mk;+������z���f�C�4�I G`����f}��%�A2.�s�e�Z�jd�g��]��=�N�n��[WmBE����9�nG!Z�v��M�\���|,�V�[��Z�T�B��x�|_W�s��u�E��e�
���{���3��*]y�c��)�0������.�!G��i����~#��������:�U����y$�����S1�� �}9*NN��<����U{����hNrvGm�[E���<����Y^�
�?|�-e��[";/)����%]�sI�����]�7�� �bz�+; %�3����VST!��&
�U��Hl��
��8�~:H��,7�����R���s�RpU%{�[k������� �y�k��/[.[	���g���M�Q=���:
s�*��^���kAz�����x�����x?������oaw��	p������w�)�g��7�M$�v��.�5����N��&��3%	����lz��Y
pHX�����adt��	yt���3��N���mO�s�u�fr�:�r9���)�Z'�fl48t�5�]vOt�L�g���mpq0�bDK������,%R�����I��j3p5��Fj�rxX��qBy�������R��|��&4Y#���fE�w���Fbg��O����[����h�!V����'�nb��`���I��v������������i7!`���X�W�x/ �=M�X�]A��4x�Awg����jF�o���-�=FR^H8k���m�
�����$��K(A�<�*dU��xw������j�	-'5
WR�{�Np}��� �8/���?���CuJ��:�.�)�v��#E���,����_S>y�[q.�HP��]���������Hd7�#�������K�8N��O�� �`���z�=�V��*/�i��:��iv��a&���eu9�fe��5���C���L)k?,fT��5{��P�G�����J�4�wCcq����OV$��DSu�R�9�O�)���[
CHJSk�j7N�e�����i�5��L��O7[
m6�b��=�/
�4R��Y�?%_8Q�5id
k�1�xp�/��D|	���"��g%x���i��u����)�����1L�,�!l1�������47o�6�P��kW��|��y"��	;���q\��T���'�*���)0�X�|kW�7'I.��L?N5l��H�x��~ ]%,����i�$�ik�Z�S�I6Y�oO����:�/i2��-%�_��������O��<���i,0&��fw1d.�>���9�P?@����|�k�{���Lf�7�VQm�����~J�R3������@�a���������0t����q�0��Z�gT?m~:CRb:������|�P3&�C��"B��	F;�����~�' A!�y�U�(Gm���~�b�N����������aD�Q�����������{{K����x��&�)X^\���&e��W��a5��s�<k��M>)�]����%�����~b9�}��HS),��?a1Z@����f���U�����A�*� �~Z����6�W�.(v�GTq��
Qm��=���5
|���W��)G�w�N�O�>79V��udI~�V��v�n�=	Q�\��MU�����LJU�*u,���;����%�3�P\��b�G�V�)�~Ncf��t�e���c���|j!.�~�0.|����f�)��+��;�k�����J�f����/-��p�[�:n�z�!)��\;t�X�q�F	SE�!�L���3�����]�!5D�����6���0�>��J\�MZ�GM�g������4#]�<����&��K0���<"���K"N�.��6������j
�Q2�=���?�Q:����`\"P�F��R�)�YVI9�R>6�S-���w��J�7{�H>%���������#F2Q��4Y���������9�4G��*u�4��A4���;�������
�8�|��^������E��e���I�����O*]�������:�:���Qd4N;5��b���0`��f�
G6^���\����U������*����T#�����3��� ��Z�k�l3�u��k�r��(G!�K>����T�FF
ZB/B�SR
�76��������1�����T#{#��e5~b�!���gH���=U�m�Ot�?�����*��1�>L�Y�0r������������t���� a�
�_�hS)H5��E_�����UDHq�������C��Y4�H�]t�����j��Z������IP�T�[]�����#�	kE�R����	Z�B���<����Sd��<�h���A>�yQIt��#������P���kib������m�t0-��{���Y��=�`���+�~���kss�z�������4����i�d�+7<W���Sy_�1��6}Z?���X�4����U�[-���#�3�C���h�m;�Vj����#��F��p��Go���\�sZ��������E:����t��+s-9�x?Pn�=PA�BM��t��\	QDT4"����!F��s�o�y5<�{D�.��]u�v���T@��@��0O<����J����E]��EIw�$SA�^�v�-5��������N`c!��kU���H<w�;h����,|�Y����P�N��T?p�)|���������VhgF���8�-�*o+��Y�zF?%��$=���J~;��/Z���U��t��A�<��t��)�c�����I�Xq���*�v��zLy�!��7�8����
?�
g���,������vr79���\^����G� �Oz:p��x��n�`�[<���ZyI8�z���SI,?sZF9��Ob���@�r�q��8m���|���(}��H?hK+(�B#\C�[�R0��$=r'���=���j^oH����F�:H����XJZ��hrDN<���x���tt���	
H���wz*���^��P�=iXC������c
z�^��0f$FcB���E������*D�)�]��<���,A������I���s�@�Ha���9�1
��*8U��x��T&d��9�ke Elx<��W��-�#w"O@S��,�)�qNS��x
��0����8�,]/��h���rGi����U�15{"���>���j��(���6u�mO�vm���
��F!de�iJZs1@G7+W���S���
{QOc]����2n1D��a���8��:0W��sU K�fc`���~D���eL��vj.����/&@86y��*�����CGp��%�s�4���i
d�~�NT���4
c�rB[d�_no_S+��.+^��db���i]=��
Qh�f�k�Q�6���a�\L���6�����3��.] �Q��z,u=L��O���&�����@���)��E2�W�h�t�@*�vX�
�'��G��+��d��p���]O�Z���MJm�WeaE��c&,V �h��i8���"e;�chqk�l'L����3�v��e�������qN'�������2������L��WiQ�ku((��+���l�g&�A.{
Gmm=���:�k6!R������t��V��(�:>j�j��n��|#��<�d�t���%��������F��e���1�fRMsvx�����KX�sQ
r�)�<W��j�F��}[_zY�:��ic�g����4�#Ao��q�J��G����������������&�P��4a��B���=>R�"cx^U8T ~�����=��S�������@�&���&��T�EQC�����ur�����)\ge�U9%�����qw�vx������<������o��:F���c�<v��~�n�!������Z�����E�	d1v9�=���f��Q��� ��8�[7Z�-�K�D��p�n����:�����[M#~��J�1-WMRoaXI�|��z81-I�C�>n��P����=b:n.��� ����]0��:9D_]v	����25-X��HS	@.
�E�?s�M^�*���p��1-}%T��6��AAN�3}H�.��~���+��{w���]����v��Be~�����������x�2�������'��$���b�tH�����k�D�WU3#�44�����a����=R�0�(4q���Y_L�|
�w\�,��kTJ�c�1�
����Q�6/�N��!+�=�|��\=����\q�������ck�9��)��%,'�;��!/P�~q}z�\4;LI��|�aH��'���\������-�c��k��O�L�B���g���}������NsTN�b�O���-H7�;�<h=�P�����:�[����}N.�BO�3l����+ndi����e�|����6eo�C�"����'[�IV4��S.k\q�	=��k��my�a���V&	���.��V�6{���')
�����cTs�LpPQ��T�g�������W@>��L
�i�} '����O�M�K2\Y�q���^+}������QW�T�{<�o@vx���YI�����(\����Q>*��M������5�kC�x{�{:2J�o
�Gi�v/�tb1�I�.�=M
c��p�rJg6�� �����1��l��+n;�����I����O5���z���9E��CU��\�������j:h���>�����/v��rJ&�Y.�?I)�S������M����~:�����Ii�g�~�u�|��c���������q��o������!���v+�Cu��%�����x�f3�
�j2���"��K��+I�W�i�+IyU����Q7�}���c�W������i�~���a����.ovuH@?�yQ>�l�J!�v,.d/��oYV�	�2}�����>{�����G1���hVT�6�<^���g��t�i`���1:���=�#��h��A�\����������f
��QB���s�@�u�l����?�2� ��G����i�o��$\�4)�7��yY��"��]RQ�O��3�����	9���Y���e�U#�r�5�����	?��'zNZ���^6�O���-����f'LVT���$�z�eDRm��r���0l��;��h.�5M������R��/u�k��?h��uf/.S�
"1��#r1�%��s����+?����RM5�aj���Y`����7�����=�:S�-z
����&Mq�G���/+�����,<h��\"��w���+F��:�<�����
t!C\�-!��]�t4��b������z�&��j�&�������5o�vr/����_����[�������S
���1|�����t1{�x�N�����������1SMe4j\w�\Oow�Kc��%q�g,
��e_��z��:G��#sQ9�<��&8��IN�B������������
 ��
6h�������i��vZ�h�S�8���5�cy���U���%�V.sM�>��]:��l�"$�hE*�N������1D���
�W9H�
L��M#�MM��`�������bC�M��H�E��x������M#����U,�qGY�����p/��( �'�@����j�48cV����*���a����_g^�t������y~S����JO������Eq�1n(c�I�lX������r�e���~��0q�h�F��a!Y�+Z%R����or���5�����q������2�p����DQ��+
W~������k��C	����i�k��\��;p��q�~����=#-]g�����3��]�r�Y]���r|)��������{{�c`)������i����M�d�^���$~f"�/�q������"�V1w�>.�������5�������[��q/�rN{")^rQ���4CO$�`h��'�~���"��`�MB4~-#�]��}Cq
E�
m%Y
5F���������+�H��|�GV{B�"��I���c$����M1�1�K8U���:�i��!C��A98L���"G��fSbE="�=�����`���������<+��;�������������
]�����7g����b�~�R���]�1��T�8��k�K�	v)e��Z�P��p���ZS9�n���wN��&�8����g�4�z_�23���'H/����"�t���KN�V�xm����1�uAQ�!")6�5���7N��
������0�!��u,����
/���R�t���uB=����t��$�7]r@���eWq�!fr�?`
IPl/��k��T��?\>�+*{W�C�7n�4�z��@L	0i� 3y�K��^3S���������4����M�����'Tu������~�n���y����p�_q��������J�����	�EbYA��P��K������_����6<i�
��S�)�%���tejXa|ssE�Ky*��	s�O���v�Q0(�p_�QJsb��*�:9�W(�v��Z��
���q��x4�M=��pcd�D�AjQ�(�d9�=^P�F�"ts�%\V�j�*V�"/���/'�G.���T�0]�����k[�S�p�'Q��z:���f8mAxikTTM|�'��<q��~�����$�]U������)o���Y���x�9�{dY/��p�3��7W��/��� G���v&�-��,�n��8�����U��i��C���\E���{je��T�e^�����I����E4D]�p��
(������.��(���:����	aw<B�%u	��k���I����� W'����HOnM�{�W�\�RC��d��m���:46�l�VL�����5I���3���� 1y�'��_�-t�)59�y\� �_�fz�2���.�K�����U��N��F�^�r�&e'�&}���v�k	%���m���:|H��q^���vL58������|��_4�t�^Q�to��3z0A��
|*E�������U��-�}��=��)#^��do���}2O��e��#��}	<��y
8���%a0��M���J?2V������� ���l\R�����B\i�"���M�W�KD���Z.=���8��^TD&���|��;=��\����EK�]z�~Z?D��hjs���j�������z�������$������=�2@��Y�n[��������������R�S
!N>��5�x�������D�%���O�G���EIG�.?UM����t��s�yj0H��vhC�A��uz����Q�&���
1Q����majrg�]�i|�G���������B�B��$)Eze�^����X�$~����� ��#X��_������:�R����q���p7������_��W��t%�RG�����Z�l�W���c�&
Z����j�Z����o%�����D�L����������/�g��~�U����t���CN�a�{�������UAaOemJF�^�`�(�S����R��*�ic��l9���U��P�;�l���3�m�j�������-~�S�O-��a;.���0���0!�i�����i[�����&�L�5!p�O7�%���|M�0*��rO:�wM��d>�A��H��Q�z�dW�x��x^J���H�9�#����hT�4t��%�Pn2j����k�e����A�Z{P� ������}`�I�u�]3�+����_�a6����X�w�`r�N��4�79��
N����p���
=F��]����:��aEY�?�����W��
���}/5^��N��q?��j��`��TrX����R
���v�c}��������>���i�r��f�c�(yE�D%W{�K�d���O��������o���y����
d�tz���[��5�Oh��NFS_��>�Px^�{{��EeCB)
�_������Z�����^������a>�:�����lZ��VC��ej`@x�^@�����?g��!�D�(��.�;W9�`T�
�m!�'�J>���y�1�k}�%�-��8��	f=�b��'kk�V8j���W�Az�
�aY:y�0<5/����w�$��(������dR@�Zk3�{�_Lq���|�C]k��d��3�@U ��+T{���A2�vl��Z j�F)���^��m5��(�IRL�	��*�8��w����d��,�^U)���5����+����=��c�q7���h�&8��e��]D��bX����t����14�V.������e��iY7��L6 �f�W
7ND�`%%Y ~�?������-�����0�]�s:Q�����|r��"N���������rd�d,��#�o1�}���f�'%�Y�g�>�h����3�~�'G������#!D)��H�137�)�"�� ��U��$k"���9.D-I����_��B�U�;F��O�IR��W��p?�6�Y:��,�G�C��S��zc*���rR
�$�2k^�y5+��3������1��
�b�x��j�9�Qm;�c�c�:H�����Er2�N��I��W���@^�T�h�[d���2G5~����d�LL��M�p���\������j���z���79��Gg-��������w���Y��Y�Nd`����-#���.$��I{nEo�����k��)0���� .&����w�^r�Zi����jC�]�����um��O.)&��������!|����dv{��yS
�gIuTy�����0_u���J��7��C�j�*J�Y�b��X��9�����d��:�����A�w3�N�D�3�^�_�St���Q��y)"�e�����4m�JL9?���"�#$�U:R�[Kw ;��;�=�c2r#>F�$���5r�SP��T���ct-�����9Csu�wB�\���@�.	J����$b��e���"�1����������'&�t��PP|s2�o]�%��������S?6S,)�@��U�[�w�������*�oJD���'bc�Y@�b�K���s��V�W���j4D����W��M����4�i���D��8��_��s6*���Y/rL�/D�����h�=ty*s%��� -<�CU3i��=�	b�ez
���>{�WE�������i��WDu��<;�Wn�N��U����>��i�Cd�F��>c��$���;O�����\	�3;�+����^\/���������A�/��vKj���z���XNG���r�Mkp?�y�����W��6j�K/��`?��a������8���h��G�����ItQ�Q���:3��L����^5�����|�pbe��v��!�R���8���X���d��������SC����
��\�hzU"_;�UB?�o�!�!��6�d�W��}�/O,ie��e���!QZ���}Ux^:�u�V�{�������ts�NC{qP�~�]Y�+����R?�
r������&A�m�h�L^�����=�d���N��t�%��
n^�\��D��iQ��}����(�S�t��b����"����A�a�����u�RN��J]�2�tT��2�
�<��*6XE�+�S�����1$g�H���!�.�Jc�r@�|�f�Gu���]����V���S��Yk}�a��e���u�A�&��se3x�cvU-�a��f
n-."�7i�(5����S�����`�j(('Q�B����<T�U�<��(FU�ge��M��<������z����Fo'G���i��5��a���r���3�&�e@b��Zl�:�4t����N]�,�����@�*u5�7�K��zgf)����*�r���w������@���@����j�P.7{2M��kg�z�������WXL1r��G��9����kH/i��������-���3���|�)M��oka��p|���f�gN�T;n`:)���K�M��y�A����Q+�������g�	�������[mzx����-\�ZS����R��~3���RTN�����_��N���F��]��L���M��rr�3.`��`�<�6��Rm%�Ic������?A��uy�Y��PR�{���-�������8���JH?��cA����%�`u���U@���X���f�2 ��J���S?�����*x�\j����y�Y��i�[�Q��D~��l-�#����c�{��\�F>*�	*l���3��ZJ�X�����P�R�,6��	��4��������T��������Yz<����+��eO5��,��d��|�D��^��������`>R�aY��!�Ra
O���]�l�����\�x+���,��$��:�t�t&h M��o���o�W5~XlU��Y
��]|.�5"F:��\+�|��/��5�]�"
n(�x�U�������^{-[�Gh��#�=_ez��	gA�����F%rVy�6��Y����|Fo�Cu���Z?�h)w��`��U�1��=�f���:���Zrh������R�u/���0s��;��/�����``dH^
�3�h�O�#qZ��Y�f�M`h����� 19=��@6�U������y����� Q��u�x�������"��K����6K�$s:T4Oqs�x,B�*�*���+Xq��<��X+���E�����A�lG`)�J������7#����P�q�;DOv��lG���j��h�_��~�h��vy�z"����I���`bPOrdw5�;�@_����ER��e'�^cH;�����T�����-D%�v�]��'����%n-e�LA���ry������e�h�\�.zY����4�5)A&]VQ/�o�����}C��]�,��7�e������r��P����r{��hg�k&&"5��Xj*`����?��n�c��W�[�����&�y��B.�[J10�R�h>�t?��V����Z�m��r��I3�MH�L�C��q]���D�\|�R��n��{�����@o���� ������=:���>n��rO0Ku
��!�H�2y�����J��	�2|����IR���N6�X��W��}@���E�������$��i5�dV-��W.^S������ �'���N��B��u�x�!��O�4��c�C\�����Q�"?���Y�{X��d�U6����T�k����.��F�i��yS��D�����ZmB6����9��!��2�<BSP�B84p8��B�A�u�R~v�f�k>�������0�	�-��������2c3�n�YE�A�spE9���k�8����_�f�����\��RZ1�)%����Ir�z���G8�
��u��L���"��`�@dm�z����g�
��)�f&�B]�d�J3���.�����vw���,������qc������R�W����L5���@�O>-'P�h���X�=�����!=�!p�t��u�{�Mk	=Y�ihVL���������2fx�:6j*�Z��G0]O�&�S4�(���$�6���w���od�15������B�ht���3���F�,'�&��,O��x��b��d6�Z����:��]b�'r�i�&������D�.����X���=��������;��������xo���*Oz�CR.���sB�a�f�����;��0�N������,/�g�k]V��k���)��8y��8=�1��2,}�3*�����EO����.#��hHV�����e�]�!0���<���{���@u�����D��]l���DA$����L�~����V;���Rb3���p���v��O�a�(��{����>E���,:�{c��p��\�2R��������j��n����|Z�gT$e@��T`��i�Pq{�iT,�:�:�3wr�+ZwT+��&r<���]���!�l�����xe��ei!�Y����`����1�.�cDz��8^zy�T�����:C�g�r��&=�5���}��+{�pj6��w5z�z�8@C;�\������	�r�%}&"�>u����1�iuh�_W��������?��pB�jO+�;��q|t�9e��r�
���07�k��������5���z7���~��;�Y��we&~G|����w��iKi���o���m��~���w'����)j����]�SM~�!l��M)��~��8����0"~��v��L�>v��/i���]*�t%D:�����Y^U�L�cO.���!8�}/'j��^�$���8�9�	���2iC�b��	
�� *������pN$9�0����J{.!���}���������:�}y."����
0)� -P?���.���&#JWR��P��1")�M$����6�,��g-���F�[�]�'�m���3��B���k����=��w��UCm���1���._rXc�,6����#�����	/%����>1�dO}�g����S��� �J=�������t������P����d!�+���Ik�*����$�=�x���s�����
�ph&o��$yY�$����<D�����`\G����qC�G����C��b����h[$����'��:�����P�����B��/���c�[���l����l'����|�
�f0���-��l����,{�r��W�J����gM ��\:�`O��0YVU�������}{?ku�`s+�a�W��'}f�n<��9��-�5�2�cHw��H��L(j���`�Ru��/�h�N&9E1$�Ql��`N��#������Sa�*N����/w���l�O�Bu��������bDF%�!@2���v�
Eq���
���!�Jt1	{�j9�{�W)/<����0���/�o������2n�H�'�?����
IqGt�G���A���y(O��Z6c�����v��7�^���W�th��m� M��]�^x(�F[B�o<�������5�6���C,� l5~��!_K^D�|O�����)���{��+�����:��}��W����)�<2���v\z���$��zB1���H)<��N�edrBX�J�����P����]�*��y(�LP���3�A�D����J
�V��]��MFg��|�)��W��2���������:@Pd���
x����gi6&O3��B����zc�v��b��<�^����x8&�a�/~�� c�������~2�X��5�{�;���W�c��r2G�`�2��EZyb0t�&�s"v+�]�d�:�.��G7F�Y	�*Q^�/���F�`�M=R�N�T����4��e�_�CmF�����So�����EJ[�f`�4��P�8;����
rT9��4���_�_�1k*��R��"�jz��n��f6jj�(���k&�'�$����c��c0��;�q���g�<�cUEN�_�nj���&(��re����|�k+��X>�P����	S���]u���S��n��O���H�lq
~h[E��<�e��vJ���&!I%1���p�ki������mS�!>|v�E2������4��U��|�E�_/�`��QJ�\WL����z/Nb��J��da�A��q����1j�V��W���_��%dcT;�`5$Hi��s��G�S2�E�������n�Dr�7[��IFX���<�>r���]FF%,�Cs�4���@���$Av����9��Nw���0���HH"YZ�3�������c���:�����q���1W��<�S�0F�G�M�����9����h�Y�_o�$�ar�yd�Z����)��]�{�k�t��b����`����q����u�`N|�t�$��!���t��v�t�d����(uW������<h��
������PH~��=Vj-v_��D������&
v?�[�����>HU]��db�����{���-)�(|��.=�E���vr�1�W6����R���9M��8QJ���X��\��[M-����4����L��8�&'J�#z��Ku����w���$Z�H
:�}y��.w�	�Y&]����`?�L��Y��x��w>h"2n�Q���~�d�����5�D�n����������Yee���9j�����M�&����k��\D����v5^6(g��r#n��c�(����y�.�v�-�j�'l���(#pe�����d0Z�wI;`1�n%,�k2�)=�����c�$\��=�m�27�c)~�P�n^���0��&|�n������2�]��J�@���<���zWfq��Y9���0��Z�?d� ��V�qV~�������!��c�1h���@1D�����C��4ii\;�P�&
L��l
��>.�����L����e���:*�r��������
�u�l��>���T�%/E��L��Y�����05������Z�C�_w����l��LIC�
run-vacuum-stream.sh application/x-shellscript; name=run-vacuum-stream.sh Download
#45Thomas Munro
thomas.munro@gmail.com
In reply to: Tomas Vondra (#44)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Jan 19, 2025 at 5:51 AM Tomas Vondra <tomas@vondra.me> wrote:

* Does it still make sense to default to eic=1? For this particular test
increasing eic=4 often cuts the duration in half (especially on nvme
storage).

Maybe it wasn't a bad choice for systems with one spinning disk, but
obviously typical hardware has changed completely since then. Bruce
even changed the docs to recommend "hundreds" on SSDs (46eafc88). We
could definitely consider changing the value, but this particular
thing is using the READ_STREAM_MAINTENANCE flag, so it uses
maintenance_io_concurrency, and that one defaults to 10. That's also
arbitrary and quite small, but I think it means we can avoid choosing
new defaults for now :-) For the non-maintenance one, we might want
to think about tweaking that in the context of bitmap heapscan?

(Interestingly, MySQL seems to have a related setting defaulting to
10k, but that may be a system-wide setting, IDK. Our settings are
quite low level, per "operation", or really per stream. In an
off-list chat with Robert and Andres, we bounced around some new
names, and the one I liked best was io_concurrency_per_stream. It
would be accurate but bring a new implementation detail to the UX. I
actually like that about it: it's like
max_parallel_workers_per_gather, and also just the way work_mem is
really work_mem_per_<something>: yes it is low level but is an honest
expression of how (un)sophisticated our resource usage controls are
today. Perhaps we'll eventually figure out how to balance all
resources dynamically from global limits...)

* Why are we limiting ioc to <= 256kB? Per the benchmark it seems it
might be beneficial to set even higher values.

That comes from:

#define MAX_IO_COMBINE_LIMIT PG_IOV_MAX

... which comes from:

/* Define a reasonable maximum that is safe to use on the stack. */
#define PG_IOV_MAX Min(IOV_MAX, 32)

There are a few places that use either PG_IOV_MAX or
MAX_IO_COMBINE_LIMIT to size a stack array, but those should be OK
with a bigger number as the types are small: buffer, pointer, or
struct iovec (16 bytes). For example, if it were 128 then we could do
1MB I/O, a nice round number, and arrays of those types would still
only be 2kB or less. The other RDBMSes I googled seem to have max I/O
sizes of around 512kB or 1MB, some tunable explicitly, some derived
from other settings.

I picked 128kB as the default combine limit because it comes up in
lots of other places eg readahead size, and seemed to work pretty
well, and is definitely supportable on all POSIX systems (see POSIX
IOV_MAX, which is allowed to be as low as 16). I chose a maximum that
was just a bit more, but not much more because I was also worried
about how many places would eventually finish up wasting a lot of
memory by having to multiply that by
number-of-possible-I/Os-in-progress or other similar ideas, but I was
expecting we'd want to increase it. read_stream.c doesn't do that
sort of multiplication itself, but you can see a case like that in
Andres's AIO patchset here:

https://github.com/anarazel/postgres/blob/aio-2/src/backend/storage/aio/aio_init.c

So for example if io_max_concurrency defaults to 1024; you'd get 2MB
of iovecs in shared memory with PG_IOV_MAX of 128, instead of 512kB
today. We can always discuss defaults separately but that case
doesn't seem like a problem from here...

Nice results, thanks!

#46Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#45)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Jan 19, 2025 at 10:31 AM Thomas Munro <thomas.munro@gmail.com> wrote:

read_stream.c doesn't do that
sort of multiplication itself,

Actually, for completeness: there is a place where it allocates local
memory for max I/Os * 4, and that 4 is not entirely unbogus but
should change to io_combine_limit for the AIO stuff. Patches in
progress, more soon. But that wouldn't be using MAX_IO_COMBINE_LIMIT
or the number of system-wide I/Os, it'd be your (usually much
smaller) configured limits. I'll write about that in more detail in a
new thread...

#47Tomas Vondra
tomas@vondra.me
In reply to: Thomas Munro (#45)
Re: Confine vacuum skip logic to lazy_scan_skip

On 1/18/25 22:31, Thomas Munro wrote:

On Sun, Jan 19, 2025 at 5:51 AM Tomas Vondra <tomas@vondra.me> wrote:

* Does it still make sense to default to eic=1? For this particular test
increasing eic=4 often cuts the duration in half (especially on nvme
storage).

Maybe it wasn't a bad choice for systems with one spinning disk, but
obviously typical hardware has changed completely since then. Bruce
even changed the docs to recommend "hundreds" on SSDs (46eafc88). We
could definitely consider changing the value, but this particular
thing is using the READ_STREAM_MAINTENANCE flag, so it uses
maintenance_io_concurrency, and that one defaults to 10. That's also
arbitrary and quite small, but I think it means we can avoid choosing
new defaults for now :-)

I completely lost track of which hardware is supposed to be a good fit
for low/high values of these GUCs. But weren't old spinning drives the
ones that needed a fairly long queue to optimize the movement of the
heads? And IIRC some of the newer SSDs (e.g. Optane) were marketed as
not requiring very long queues ... anyway, it seems fairly difficult to
formulate a rule comprehensible to our users.

FWIW the benchmarking script tweaks both effective_io_concurrency and
maintenance_io_concurrency GUCs (sets them to the same value). But yeah,
10 seems like a much better default for this type of storage. So for
vacuum this is probably fine, but I was thinking more about the regular
effective_io_concurrency for queries.

For the non-maintenance one, we might want
to think about tweaking that in the context of bitmap heapscan?

I'm not sure what you mean. How would we tweak it? You mean the GUC, or
some sort of adaptive heuristics?

(Interestingly, MySQL seems to have a related setting defaulting to
10k, but that may be a system-wide setting, IDK. Our settings are
quite low level, per "operation", or really per stream. In an
off-list chat with Robert and Andres, we bounced around some new
names, and the one I liked best was io_concurrency_per_stream. It
would be accurate but bring a new implementation detail to the UX. I
actually like that about it: it's like
max_parallel_workers_per_gather, and also just the way work_mem is
really work_mem_per_<something>: yes it is low level but is an honest
expression of how (un)sophisticated our resource usage controls are
today. Perhaps we'll eventually figure out how to balance all
resources dynamically from global limits...)

Possibly, but I'd guess we're years from doing that. I'm not sure anyone
even proposed anything like that.

* Why are we limiting ioc to <= 256kB? Per the benchmark it seems it
might be beneficial to set even higher values.

That comes from:

#define MAX_IO_COMBINE_LIMIT PG_IOV_MAX

... which comes from:

/* Define a reasonable maximum that is safe to use on the stack. */
#define PG_IOV_MAX Min(IOV_MAX, 32)

There are a few places that use either PG_IOV_MAX or
MAX_IO_COMBINE_LIMIT to size a stack array, but those should be OK
with a bigger number as the types are small: buffer, pointer, or
struct iovec (16 bytes). For example, if it were 128 then we could do
1MB I/O, a nice round number, and arrays of those types would still
only be 2kB or less. The other RDBMSes I googled seem to have max I/O
sizes of around 512kB or 1MB, some tunable explicitly, some derived
from other settings.

I picked 128kB as the default combine limit because it comes up in
lots of other places, e.g. readahead size, and seemed to work pretty
well, and is definitely supportable on all POSIX systems (see POSIX
IOV_MAX, which is allowed to be as low as 16).

Not sure I follow. Surely if a system can't support values above some
limit, it would define IOV_MAX accordingly, and we'd just reject that
value. And as you point out, the IOV_MAX may be as low as 16, so it's
already possible to get a GUC value that gets rejected on some systems
(even if it's just a theoretical issue).

I chose a maximum that
was just a bit more, but not much more because I was also worried
about how many places would eventually finish up wasting a lot of
memory by having to multiply that by
number-of-possible-I/Os-in-progress or other similar ideas, but I was
expecting we'd want to increase it. read_stream.c doesn't do that
sort of multiplication itself, but you can see a case like that in
Andres's AIO patchset here:

https://github.com/anarazel/postgres/blob/aio-2/src/backend/storage/aio/aio_init.c

So for example if io_max_concurrency defaults to 1024, you'd get 2MB
of iovecs in shared memory with PG_IOV_MAX of 128, instead of 512kB
today. We can always discuss defaults separately but that case
doesn't seem like a problem from here...

Yeah, all of this makes sense. I don't doubt this is a tradeoff, and if
the GUC gets set to a high value it might have a detrimental effect.
Still, 256kB seems a bit too conservative and "not round" ;-)

regards

--
Tomas Vondra

#48Melanie Plageman
melanieplageman@gmail.com
In reply to: Tomas Vondra (#44)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sat, Jan 18, 2025 at 11:51 AM Tomas Vondra <tomas@vondra.me> wrote:

Sure. I repeated the benchmark with v13, and it seems the behavior did
change. I no longer see the "big" regression when most of the pages get
updated (and need vacuuming).

I can't be 100% sure this is due to changes in the patch, because I did
some significant upgrades to the machine since that time - it has Ryzen
9900x instead of the ancient i5-2500k, new mobo/RAM/... It's pretty
much a new machine, I only kept the "old" SATA SSD RAID storage so that
I can do some tests with non-NVMe.

So there's a (small) chance the previous runs were hitting a bottleneck
that does not exist on the new hardware.

Anyway, just to make this information more complete, the machine now has
this configuration:

* Ryzen 9 9900x (12/24C), 64GB RAM
* storage:
- data: Samsung SSD 990 PRO 4TB (NVMe)
- raid-nvme: RAID0 4x Samsung SSD 990 PRO 1TB (NVMe)
- raid-sata: RAID0 6x Intel DC3700 100GB (SATA)

Attached is the script, raw results (CSV) and two PDFs summarizing the
results as a pivot table for different test parameters. Compared to the
earlier run I tweaked the script to also vary io_combine_limit (ioc), as
I wanted to see how it interacts with effective_io_concurrency (eic).

Looking at the new results, I don't see any regressions, except for two
cases - data (single NVMe) and raid-nvme (4x NVMe). There's a small area
of regression for eic=32 and perc=0.0005, but only with WAL-logging.

I'm not sure this is worth worrying about too much. It's a heuristic,
and for every heuristic there's some combination of parameters where it
doesn't quite do the optimal thing. The areas where the patch brings
massive improvements (or does not regress) are much more significant.

I personally am happy with this behavior, seems to be performing fine.

Yes, looking at these results, I also feel good about it. I've updated
the commit metadata in attached v14, but I could use a round of review
before pushing it.

- Melanie

Attachments:

v14-0002-Use-streaming-I-O-in-VACUUM-s-third-phase.patchtext/x-patch; charset=US-ASCII; name=v14-0002-Use-streaming-I-O-in-VACUUM-s-third-phase.patchDownload
From 2397c7fb4b91f907ebcec60d35067c5072a2ae8b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:23:05 -0500
Subject: [PATCH v14 2/2] Use streaming I/O in VACUUM's third phase

Now vacuum's third phase (its second pass over the heap), which removes
dead items referring to dead tuples collected in the first phase, uses a
read stream that looks ahead in the TidStore.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 42 ++++++++++++++++++++++++----
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 70351de403d..222ee01e1ad 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2250,6 +2250,27 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2270,6 +2291,8 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	Buffer		buf;
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2290,10 +2313,18 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (BufferIsValid(buf = read_stream_next_buffer(stream,
+													   (void **) &iter_result)))
 	{
 		BlockNumber blkno;
-		Buffer		buf;
 		Page		page;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
@@ -2301,8 +2332,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
 
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
@@ -2315,8 +2345,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2329,6 +2357,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v14-0001-Use-streaming-I-O-in-VACUUM-s-first-phase.patchtext/x-patch; charset=US-ASCII; name=v14-0001-Use-streaming-I-O-in-VACUUM-s-first-phase.patchDownload
From 6d61028e0900d6b96bfe72cb0c262e22b8bde39a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:21:41 -0500
Subject: [PATCH v14 1/2] Use streaming I/O in VACUUM's first phase

Now vacuum's first phase, which HOT-prunes and records the TIDs of
non-removable dead tuples, uses the streaming read API by converting
heap_vac_scan_next_block() to a read stream callback.

Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 100 +++++++++++++++++----------
 1 file changed, 62 insertions(+), 38 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 075af385cd1..70351de403d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -108,6 +108,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -296,8 +297,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -907,10 +909,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm = NULL;
 
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -926,26 +929,27 @@ lazy_scan_heap(LVRelState *vacrel)
 	initprog_val[2] = vacrel->dead_items_info->max_bytes;
 	pgstat_progress_update_multi_param(3, initprog_index, initprog_val);
 
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
 	/* Initialize for the first heap_vac_scan_next_block() call */
 	vacrel->current_block = InvalidBlockNumber;
 	vacrel->next_unskippable_block = InvalidBlockNumber;
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point();
 
 		/*
@@ -986,7 +990,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages.  Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 									blkno);
@@ -997,6 +1002,24 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, (void **) &all_visible_according_to_vm);
+
+		if (!BufferIsValid(buf))
+			break;
+
+		Assert(all_visible_according_to_vm);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+
+		vacrel->scanned_pages++;
+
+		blkno = BufferGetBlockNumber(buf);
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1004,10 +1027,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1063,7 +1082,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1117,7 +1136,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		ReleaseBuffer(vmbuffer);
 
 	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1132,6 +1151,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1143,11 +1164,11 @@ lazy_scan_heap(LVRelState *vacrel)
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1157,14 +1178,14 @@ lazy_scan_heap(LVRelState *vacrel)
 /*
  *	heap_vac_scan_next_block() -- get next block for vacuum to process
  *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
+ * The streaming read callback invokes heap_vac_scan_next_block() every time
+ * lazy_scan_heap() needs the next block to prune and vacuum.  The function
+ * uses the visibility map, vacuum options, and various thresholds to skip
+ * blocks which do not need to be processed and returns the next block to
+ * process or InvalidBlockNumber if there are no remaining blocks.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1172,11 +1193,14 @@ lazy_scan_heap(LVRelState *vacrel)
  * relfrozenxid in that case.  vacrel also holds information about the next
  * unskippable block, as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1189,8 +1213,8 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		vacrel->current_block = vacrel->rel_pages;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1239,9 +1263,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1251,9 +1275,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

#49Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#48)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Wed, Feb 5, 2025 at 5:26 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Yes, looking at these results, I also feel good about it. I've updated
the commit metadata in attached v14, but I could use a round of review
before pushing it.

I've done a bit of self-review and updated these patches.

- Melanie

Attachments:

v15-0002-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patchtext/x-patch; charset=US-ASCII; name=v15-0002-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patchDownload
From 0483495b5a3a35cf27672ad08901509b8c32d49d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:23:05 -0500
Subject: [PATCH v15 2/2] Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 49 +++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index b941158c645..740565b8379 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2261,6 +2261,29 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ * Gets the next block from the TID store and returns it or InvalidBlockNumber
+ * if there are no further blocks to vacuum.
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2281,6 +2304,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2301,7 +2325,17 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+
+	/* Set up the read stream */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (true)
 	{
 		BlockNumber blkno;
 		Buffer		buf;
@@ -2312,9 +2346,14 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		buf = read_stream_next_buffer(stream, (void **) &iter_result);
 
+		if (!BufferIsValid(buf))
+			break;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		Assert(iter_result);
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
 
@@ -2326,8 +2365,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2340,6 +2377,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v15-0001-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patchtext/x-patch; charset=US-ASCII; name=v15-0001-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patchDownload
From bc33c168c2799dd397b7b67f22e3d8fdf40bc043 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:21:41 -0500
Subject: [PATCH v15 1/2] Use streaming read I/O in VACUUM's first phase

Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 131 +++++++++++++++++----------
 1 file changed, 83 insertions(+), 48 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 075af385cd1..b941158c645 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -108,6 +108,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -296,8 +297,9 @@ typedef struct LVSavedErrInfo
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -907,10 +909,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm;
+	bool	   *all_visible_according_to_vm = NULL;
 
 	Buffer		vmbuffer = InvalidBuffer;
 	const int	initprog_index[] = {
@@ -932,20 +935,22 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_allvis = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm))
+	/* Set up the read stream for vacuum's first pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
 		bool		has_lpdead_items;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point();
 
 		/*
@@ -986,7 +991,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages.  Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 									blkno);
@@ -997,6 +1003,23 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, (void **) &all_visible_according_to_vm);
+
+		if (!BufferIsValid(buf))
+			break;
+
+		Assert(all_visible_according_to_vm);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+		blkno = BufferGetBlockNumber(buf);
+
+		vacrel->scanned_pages++;
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1004,10 +1027,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1063,7 +1082,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer, *all_visible_according_to_vm,
 							&has_lpdead_items);
 
 		/*
@@ -1116,8 +1135,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (BufferIsValid(vmbuffer))
 		ReleaseBuffer(vmbuffer);
 
-	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	/*
+	 * Report that everything is now scanned. We never skip scanning the last
+	 * block in the relation, so we can pass rel_pages here.
+	 */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
+								 rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1132,6 +1155,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1142,12 +1167,14 @@ lazy_scan_heap(LVRelState *vacrel)
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
+	 * We can pass rel_pages here because we never skip scanning the last
+	 * block of the relation.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1155,28 +1182,37 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	heap_vac_scan_next_block() -- get next block for vacuum to process
- *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
- *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process.
- *
- * vacrel is an in/out parameter here.  Vacuum options and information about
- * the relation are read.  vacrel->skippedallvis is set if we skip a block
- * that's all-visible but not all-frozen, to ensure that we don't update
- * relfrozenxid in that case.  vacrel also holds information about the next
- * unskippable block, as bookkeeping for this function.
+ *	heap_vac_scan_next_block() -- read stream callback to get the next block
+ *	for vacuum to process
+ *
+ * Every time lazy_scan_heap() needs a new block to process during its first
+ * phase, it invokes read_stream_next_buffer() with a stream set up to call
+ * heap_vac_scan_next_block() to get the next block.
+ *
+ * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
+ * various thresholds to skip blocks which do not need to be processed and
+ * returns the next block to process or InvalidBlockNumber if there are no
+ * remaining blocks.
+ *
+ * The visibility status of the next block to process is set in the
+ * per_buffer_data.
+ *
+ * callback_private_data contains a reference to the LVRelState, passed to the
+ * read stream API during stream setup. The LVRelState is an in/out parameter
+ * here (locally named `vacrel`). Vacuum options and information about the
+ * relation are read from it. vacrel->skippedallvis is set if we skip a block
+ * that's all-visible but not all-frozen (to ensure that we don't update
+ * relfrozenxid in that case). vacrel also holds information about the next
+ * unskippable block -- as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	bool	   *all_visible_according_to_vm = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1189,8 +1225,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1239,9 +1274,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = true;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1251,9 +1286,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

#50Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#49)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 6, 2025 at 1:06 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Feb 5, 2025 at 5:26 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> Yes, looking at these results, I also feel good about it. I've updated
>> the commit metadata in attached v14, but I could use a round of review
>> before pushing it.
>
> I've done a bit of self-review and updated these patches.

This needed a rebase in light of 052026c9b90.
v16 attached has an additional commit which converts the block
information parameters to heap_vac_scan_next_block() into flags
because we can only get one piece of information per block from the
read stream API. This seemed nicer than a cumbersome struct.

- Melanie

Attachments:

v16-0003-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patch (text/x-patch; charset=US-ASCII)
From 920b567a0afa77f1478f804dc934936a9e9e9ae8 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:23:05 -0500
Subject: [PATCH v16 3/3] Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 49 +++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index f705f31f5e6..5b2d9c535c1 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2637,6 +2637,29 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ * Gets the next block from the TID store and returns it or InvalidBlockNumber
+ * if there are no further blocks to vacuum.
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/* Save the TidStoreIterResult for later, so we can extract the offsets. */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2657,6 +2680,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2677,7 +2701,17 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+
+	/* Set up the read stream */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (true)
 	{
 		BlockNumber blkno;
 		Buffer		buf;
@@ -2688,9 +2722,14 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point();
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		buf = read_stream_next_buffer(stream, (void **) &iter_result);
 
+		if (!BufferIsValid(buf))
+			break;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		Assert(iter_result);
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
 
@@ -2702,8 +2741,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2716,6 +2753,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v16-0001-Convert-heap_vac_scan_next_block-boolean-paramet.patch (text/x-patch; charset=US-ASCII)
From 4d6e90bd73dc185415d0278a19b22c691b6fa598 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Tue, 11 Feb 2025 19:02:13 -0500
Subject: [PATCH v16 1/3] Convert heap_vac_scan_next_block() boolean parameters
 to flags

The read stream API only allows one piece of extra per block state to be
passed back to the API user. Heap vacuum needs to know whether or not a
given block was all-visible in the visibility map and whether or not it
was eagerly scanned. Convert these two pieces of information to flags so
that they can be passed as a single argument to
heap_vac_scan_next_block() (which will become the read stream API
callback for heap phase I vacuuming).
---
 src/backend/access/heap/vacuumlazy.c | 47 ++++++++++++++++------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 8c387ae557e..25aeb23aa30 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,6 +248,13 @@ typedef enum
  */
 #define EAGER_SCAN_REGION_SIZE 4096
 
+/*
+ * heap_vac_scan_next_block() sets these statuses to communicate per-block
+ * information to the caller.
+ */
+#define VAC_BLK_WAS_EAGER_SCANNED (1 << 0)
+#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM (1 << 1)
+
 typedef struct LVRelState
 {
 	/* Target heap relation and its indexes */
@@ -417,8 +424,7 @@ static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
 static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm,
-									 bool *was_eager_scanned);
+									 uint8 *blk_flags);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1171,8 +1177,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm,
-				was_eager_scanned = false;
+	uint8		blk_flags = 0;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1196,8 +1201,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm,
-									&was_eager_scanned))
+	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_flags))
 	{
 		Buffer		buf;
 		Page		page;
@@ -1206,7 +1210,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		got_cleanup_lock = false;
 
 		vacrel->scanned_pages++;
-		if (was_eager_scanned)
+		if (blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
 			vacrel->eager_scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1331,7 +1335,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer,
+							blk_flags & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
 							&has_lpdead_items, &vm_page_frozen);
 
 		/*
@@ -1348,7 +1353,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * exclude pages skipped due to cleanup lock contention from eager
 		 * freeze algorithm caps.
 		 */
-		if (got_cleanup_lock && was_eager_scanned)
+		if (got_cleanup_lock &&
+			blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
 		{
 			/* Aggressive vacuums do not eager scan. */
 			Assert(!vacrel->aggressive);
@@ -1479,11 +1485,11 @@ lazy_scan_heap(LVRelState *vacrel)
  * and various thresholds to skip blocks which do not need to be processed and
  * sets blkno to the next block to process.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process. If the block is being eagerly
- * scanned, was_eager_scanned is set so that the caller can count whether or
- * not an eagerly scanned page is successfully frozen.
+ * The block number of the next block to process is set in *blkno and its
+ * visibility status and whether or not it was eager scanned is set in
+ * *blk_flags.
+ *
+ * The return value is false if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1493,15 +1499,14 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static bool
 heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm,
-						 bool *was_eager_scanned)
+						 uint8 *blk_flags)
 {
 	BlockNumber next_block;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*was_eager_scanned = false;
+	*blk_flags = 0;
 
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
@@ -1562,7 +1567,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * otherwise they would've been unskippable.
 		 */
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = true;
+		*blk_flags |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		return true;
 	}
 	else
@@ -1574,8 +1579,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		Assert(next_block == vacrel->next_unskippable_block);
 
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		*was_eager_scanned = vacrel->next_unskippable_eager_scanned;
+		if (vacrel->next_unskippable_allvis)
+			*blk_flags |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		if (vacrel->next_unskippable_eager_scanned)
+			*blk_flags |= VAC_BLK_WAS_EAGER_SCANNED;
 		return true;
 	}
 }
-- 
2.34.1

v16-0002-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patch (text/x-patch; charset=US-ASCII)
From e321b5caacfc231b0cb64d99881d60a487fa3a8a Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Wed, 5 Feb 2025 17:21:41 -0500
Subject: [PATCH v16 2/3] Use streaming read I/O in VACUUM's first phase

Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 139 +++++++++++++++++----------
 1 file changed, 86 insertions(+), 53 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 25aeb23aa30..f705f31f5e6 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,6 +153,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -423,8 +424,9 @@ typedef struct LVSavedErrInfo
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 uint8 *blk_flags);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1174,10 +1176,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	uint8		blk_flags = 0;
+	uint8	   *blk_flags = NULL;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1201,7 +1204,16 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_flags))
+	/* Set up the read stream for vacuum's first pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(bool));
+
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
@@ -1209,15 +1221,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-		if (blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
-			vacrel->eager_scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point();
 
 		/*
@@ -1258,7 +1261,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages.  Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 									blkno);
@@ -1269,6 +1273,25 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, (void **) &blk_flags);
+
+		if (!BufferIsValid(buf))
+			break;
+
+		Assert(blk_flags);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+		blkno = BufferGetBlockNumber(buf);
+
+		vacrel->scanned_pages++;
+		if (*blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
+			vacrel->eager_scanned_pages++;
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1276,10 +1299,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1336,7 +1355,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
 							vmbuffer,
-							blk_flags & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
+							*blk_flags & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
 							&has_lpdead_items, &vm_page_frozen);
 
 		/*
@@ -1354,7 +1373,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * freeze algorithm caps.
 		 */
 		if (got_cleanup_lock &&
-			blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
+			*blk_flags & VAC_BLK_WAS_EAGER_SCANNED)
 		{
 			/* Aggressive vacuums do not eager scan. */
 			Assert(!vacrel->aggressive);
@@ -1439,8 +1458,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (BufferIsValid(vmbuffer))
 		ReleaseBuffer(vmbuffer);
 
-	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	/*
+	 * Report that everything is now scanned. We never skip scanning the last
+	 * block in the relation, so we can pass rel_pages here.
+	 */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
+								 rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1455,6 +1478,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1465,12 +1490,14 @@ lazy_scan_heap(LVRelState *vacrel)
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
+	 * We can pass rel_pages here because we never skip scanning the last
+	 * block of the relation.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1478,30 +1505,37 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	heap_vac_scan_next_block() -- get next block for vacuum to process
- *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
- *
- * The block number of the next block to process is set in *blkno and its
- * visibility status and whether or not it was eager scanned is set in
- * *blk_flags.
- *
- * The return value is false if there are no further blocks to process.
- *
- * vacrel is an in/out parameter here.  Vacuum options and information about
- * the relation are read.  vacrel->skippedallvis is set if we skip a block
- * that's all-visible but not all-frozen, to ensure that we don't update
- * relfrozenxid in that case.  vacrel also holds information about the next
- * unskippable block, as bookkeeping for this function.
+ *	heap_vac_scan_next_block() -- read stream callback to get the next block
+ *	for vacuum to process
+ *
+ * Every time lazy_scan_heap() needs a new block to process during its first
+ * phase, it invokes read_stream_next_buffer() with a stream set up to call
+ * heap_vac_scan_next_block() to get the next block.
+ *
+ * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
+ * various thresholds to skip blocks which do not need to be processed and
+ * returns the next block to process or InvalidBlockNumber if there are no
+ * remaining blocks.
+ *
+ * The visibility status of the next block to process and whether or not it
+ * was eager scanned is set in the per_buffer_data.
+ *
+ * callback_private_data contains a reference to the LVRelState, passed to the
+ * read stream API during stream setup. The LVRelState is an in/out parameter
+ * here (locally named `vacrel`). Vacuum options and information about the
+ * relation are read from it. vacrel->skippedallvis is set if we skip a block
+ * that's all-visible but not all-frozen (to ensure that we don't update
+ * relfrozenxid in that case). vacrel also holds information about the next
+ * unskippable block -- as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 uint8 *blk_flags)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	uint8	   *blk_flags = per_buffer_data;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
@@ -1516,8 +1550,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1566,9 +1599,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		*blk_flags |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		return true;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1578,12 +1611,12 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		if (vacrel->next_unskippable_allvis)
 			*blk_flags |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
 			*blk_flags |= VAC_BLK_WAS_EAGER_SCANNED;
-		return true;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

#51Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#50)
Re: Confine vacuum skip logic to lazy_scan_skip

On Tue, Feb 11, 2025 at 5:10 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Thu, Feb 6, 2025 at 1:06 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> On Wed, Feb 5, 2025 at 5:26 PM Melanie Plageman
>> <melanieplageman@gmail.com> wrote:
>>> Yes, looking at these results, I also feel good about it. I've updated
>>> the commit metadata in attached v14, but I could use a round of review
>>> before pushing it.
>>
>> I've done a bit of self-review and updated these patches.
>
> This needed a rebase in light of 052026c9b90.
> v16 attached has an additional commit which converts the block
> information parameters to heap_vac_scan_next_block() into flags
> because we can only get one piece of information per block from the
> read stream API. This seemed nicer than a cumbersome struct.

Sorry for chiming in late. I've reviewed the v16 patch set, and the
patches mostly look good. Here are some comments, mostly about
cosmetic things:

0001:

-   bool        all_visible_according_to_vm,
-               was_eager_scanned = false;
+   uint8       blk_flags = 0;

Could we declare blk_flags inside the main loop instead?

0002:

In lazy_scan_heap(), we have a failsafe check at the beginning of the
main loop, which is performed before reading the first block. Isn't it
better to move this check to after a block is scanned (especially after
incrementing scanned_pages)? Otherwise, we end up calling
lazy_check_wraparound_failsafe() on the very first iteration, which
didn't happen before the patch. Since we already called
lazy_check_wraparound_failsafe() just before calling lazy_scan_heap(),
the extra check would not make much sense.

---
+   /* Set up the read stream for vacuum's first pass through the heap */
+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+                                       vacrel->bstrategy,
+                                       vacrel->rel,
+                                       MAIN_FORKNUM,
+                                       heap_vac_scan_next_block,
+                                       vacrel,
+                                       sizeof(bool));

Is there any reason to use sizeof(bool) instead of sizeof(uint8) here?

---
            /*
             * Vacuum the Free Space Map to make newly-freed space visible on
-            * upper-level FSM pages.  Note we have not yet processed blkno.
+            * upper-level FSM pages.  Note that blkno is the previously
+            * processed block.
             */
            FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
                                    blkno);

Given that blkno is already processed, should we pass 'blkno + 1'
instead of blkno?

0003:

- while ((iter_result = TidStoreIterateNext(iter)) != NULL)

I think we can declare iter_result in the main loop of lazy_vacuum_heap_rel().

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#52Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#50)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Tue, Feb 11, 2025 at 8:10 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Thu, Feb 6, 2025 at 1:06 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> On Wed, Feb 5, 2025 at 5:26 PM Melanie Plageman
>> <melanieplageman@gmail.com> wrote:
>>> Yes, looking at these results, I also feel good about it. I've updated
>>> the commit metadata in attached v14, but I could use a round of review
>>> before pushing it.
>>
>> I've done a bit of self-review and updated these patches.
>
> This needed a rebase in light of 052026c9b90.
> v16 attached has an additional commit which converts the block
> information parameters to heap_vac_scan_next_block() into flags
> because we can only get one piece of information per block from the
> read stream API. This seemed nicer than a cumbersome struct.

I've done some cleanup, including incorporating a few pieces of minor
off-list feedback from Andres.

- Melanie

Attachments:

v17-0003-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patch (text/x-patch; charset=US-ASCII)
From da251546b0e7a748884efc9fb6d27fab0b7a452d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:18 -0500
Subject: [PATCH v17 3/3] Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 53 +++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 32083a92c31..f85530a28f3 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2639,6 +2639,32 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ * Gets the next block from the TID store and returns it or InvalidBlockNumber
+ * if there are no further blocks to vacuum.
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/*
+	 * Save the TidStoreIterResult for later, so we can extract the offsets.
+	 * It is safe to copy the result, according to TidStoreIterateNext().
+	 */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2659,6 +2685,7 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
@@ -2679,7 +2706,17 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+
+	/* Set up the read stream */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (true)
 	{
 		BlockNumber blkno;
 		Buffer		buf;
@@ -2690,9 +2727,15 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point(false);
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		buf = read_stream_next_buffer(stream, (void **) &iter_result);
+
+		/* The relation is exhausted */
+		if (!BufferIsValid(buf))
+			break;
 
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		Assert(iter_result);
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
 
@@ -2704,8 +2747,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2718,6 +2759,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v17-0001-Convert-heap_vac_scan_next_block-boolean-paramet.patch (text/x-patch)
From 6abfb26af7cd2ff55c9b13cadcef2d56b6460a28 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 16:49:10 -0500
Subject: [PATCH v17 1/3] Convert heap_vac_scan_next_block() boolean parameters
 to flags

The read stream API only allows one piece of extra per block state to be
passed back to the API user. lazy_scan_heap() needs to know whether or
not a given block was all-visible in the visibility map and whether or
not it was eagerly scanned. Convert these two pieces of information to
flags so that they can be passed as a single argument to
heap_vac_scan_next_block() (which will become the read stream API
callback for heap phase I vacuuming).

Discussion: https://postgr.es/m/CAAKRu_bmx33jTqATP5GKNFYwAg02a9dDtk4U_ciEjgBHZSVkOQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 47 ++++++++++++++++------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3df5b92afb8..c4d0f77ee2f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,6 +248,13 @@ typedef enum
  */
 #define EAGER_SCAN_REGION_SIZE 4096
 
+/*
+ * heap_vac_scan_next_block() sets these flags to communicate information
+ * about the block it read to the caller.
+ */
+#define VAC_BLK_WAS_EAGER_SCANNED (1 << 0)
+#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM (1 << 1)
+
 typedef struct LVRelState
 {
 	/* Target heap relation and its indexes */
@@ -417,8 +424,7 @@ static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
 static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm,
-									 bool *was_eager_scanned);
+									 uint8 *blk_info);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1171,8 +1177,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm,
-				was_eager_scanned = false;
+	uint8		blk_info = 0;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1196,8 +1201,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm,
-									&was_eager_scanned))
+	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
 	{
 		Buffer		buf;
 		Page		page;
@@ -1206,7 +1210,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		got_cleanup_lock = false;
 
 		vacrel->scanned_pages++;
-		if (was_eager_scanned)
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
 			vacrel->eager_scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1331,7 +1335,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer,
+							blk_info & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
 							&has_lpdead_items, &vm_page_frozen);
 
 		/*
@@ -1348,7 +1353,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * exclude pages skipped due to cleanup lock contention from eager
 		 * freeze algorithm caps.
 		 */
-		if (got_cleanup_lock && was_eager_scanned)
+		if (got_cleanup_lock &&
+			(blk_info & VAC_BLK_WAS_EAGER_SCANNED))
 		{
 			/* Aggressive vacuums do not eager scan. */
 			Assert(!vacrel->aggressive);
@@ -1479,11 +1485,11 @@ lazy_scan_heap(LVRelState *vacrel)
  * and various thresholds to skip blocks which do not need to be processed and
  * sets blkno to the next block to process.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process. If the block is being eagerly
- * scanned, was_eager_scanned is set so that the caller can count whether or
- * not an eagerly scanned page is successfully frozen.
+ * The block number of the next block to process is set in *blkno and its
+ * visibility status and whether or not it was eager scanned is set in
+ * *blk_info.
+ *
+ * The return value is false if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1493,15 +1499,14 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static bool
 heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm,
-						 bool *was_eager_scanned)
+						 uint8 *blk_info)
 {
 	BlockNumber next_block;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*was_eager_scanned = false;
+	*blk_info = 0;
 
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
@@ -1562,7 +1567,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * otherwise they would've been unskippable.
 		 */
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = true;
+		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		return true;
 	}
 	else
@@ -1574,8 +1579,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		Assert(next_block == vacrel->next_unskippable_block);
 
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		*was_eager_scanned = vacrel->next_unskippable_eager_scanned;
+		if (vacrel->next_unskippable_allvis)
+			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		if (vacrel->next_unskippable_eager_scanned)
+			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
 		return true;
 	}
 }
-- 
2.34.1

v17-0002-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patch (text/x-patch)
From def5659ded7b7f401ea8d9deffa44b10b73a4a5f Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:12 -0500
Subject: [PATCH v17 2/3] Use streaming read I/O in VACUUM's first phase

Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Discussion: https://postgr.es/m/CAAKRu_aLwANZpxHc0tC-6OT0OQT4TftDGkKAO5yigMUOv_Tcsw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 145 +++++++++++++++++----------
 1 file changed, 90 insertions(+), 55 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4d0f77ee2f..32083a92c31 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,6 +153,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -423,8 +424,9 @@ typedef struct LVSavedErrInfo
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 uint8 *blk_info);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1174,10 +1176,12 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
 	uint8		blk_info = 0;
+	void	   *per_buffer_data = NULL;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1201,7 +1205,16 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
+	/* Set up the read stream for vacuum's first pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(uint8));
+
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
@@ -1209,15 +1222,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
-			vacrel->eager_scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point(false);
 
 		/*
@@ -1258,7 +1262,8 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages.  Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
 									blkno);
@@ -1269,6 +1274,26 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, &per_buffer_data);
+
+		/* The relation is exhausted. */
+		if (!BufferIsValid(buf))
+			break;
+
+		blk_info = *((uint8 *) per_buffer_data);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+		blkno = BufferGetBlockNumber(buf);
+
+		vacrel->scanned_pages++;
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
+			vacrel->eager_scanned_pages++;
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1276,10 +1301,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1439,8 +1460,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (BufferIsValid(vmbuffer))
 		ReleaseBuffer(vmbuffer);
 
-	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	/*
+	 * Report that everything is now scanned. We never skip scanning the last
+	 * block in the relation, so we can pass rel_pages here.
+	 */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
+								 rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1455,6 +1480,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1465,12 +1492,14 @@ lazy_scan_heap(LVRelState *vacrel)
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
+	 * We can pass rel_pages here because we never skip scanning the last
+	 * block of the relation.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1478,36 +1507,41 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	heap_vac_scan_next_block() -- get next block for vacuum to process
- *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
- *
- * The block number of the next block to process is set in *blkno and its
- * visibility status and whether or not it was eager scanned is set in
- * *blk_info.
- *
- * The return value is false if there are no further blocks to process.
- *
- * vacrel is an in/out parameter here.  Vacuum options and information about
- * the relation are read.  vacrel->skippedallvis is set if we skip a block
- * that's all-visible but not all-frozen, to ensure that we don't update
- * relfrozenxid in that case.  vacrel also holds information about the next
- * unskippable block, as bookkeeping for this function.
+ *	heap_vac_scan_next_block() -- read stream callback to get the next block
+ *	for vacuum to process
+ *
+ * Every time lazy_scan_heap() needs a new block to process during its first
+ * phase, it invokes read_stream_next_buffer() with a stream set up to call
+ * heap_vac_scan_next_block() to get the next block.
+ *
+ * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
+ * various thresholds to skip blocks which do not need to be processed and
+ * returns the next block to process or InvalidBlockNumber if there are no
+ * remaining blocks.
+ *
+ * The visibility status of the next block to process and whether or not it
+ * was eager scanned is set in the per_buffer_data.
+ *
+ * callback_private_data contains a reference to the LVRelState, passed to the
+ * read stream API during stream setup. The LVRelState is an in/out parameter
+ * here (locally named `vacrel`). Vacuum options and information about the
+ * relation are read from it. vacrel->skippedallvis is set if we skip a block
+ * that's all-visible but not all-frozen (to ensure that we don't update
+ * relfrozenxid in that case). vacrel also holds information about the next
+ * unskippable block -- as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 uint8 *blk_info)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	uint8		blk_info = 0;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*blk_info = 0;
-
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
 	{
@@ -1516,8 +1550,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1566,9 +1599,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
-		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		return true;
+		vacrel->current_block = next_block;
+		blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1578,12 +1612,13 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		if (vacrel->next_unskippable_allvis)
-			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+			blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
-			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
-		return true;
+			blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

#53 Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#52)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Feb 14, 2025 at 12:11 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

I've done some clean-up including incorporating a few off-list pieces
of minor feedback from Andres.

I've been poking, reading, and trying out these patches. They look good to me.

Tiny nit, maybe this comment could say something less obvious, cf the
similar comment near the other stream:

+       /* Set up the read stream */
+       stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,

I don't really love the cumbersome casting required around
per_buffer_data, but that's not your patches' fault (hmm, wonder what
we can do to improve that).
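To spell out where the cumbersome cast comes from: a `T **` is not implicitly
convertible to `void **` in C, so any API that returns an opaque pointer
through a `void **` out-parameter forces a cast at the call site. A minimal
standalone sketch (hypothetical `get_next()` stand-in, not the actual read
stream API; it relies, as PostgreSQL does, on object pointers and `void *`
sharing a representation):

```c
#include <stddef.h>

/* Stand-in for the per-buffer payload type (TidStoreIterResult in vacuum) */
typedef struct
{
	int			blkno;
} Payload;

static Payload stored = {.blkno = 42};

/*
 * Stand-in for read_stream_next_buffer(): hands back an opaque pointer to
 * per-buffer data through a void ** out-parameter.
 */
static int
get_next(void **per_buffer_data)
{
	*per_buffer_data = &stored;
	return 1;					/* "valid buffer" */
}

/*
 * Caller side: Payload ** does not implicitly convert to void **, hence the
 * explicit (void **) cast, just like (void **) &iter_result in the patch.
 */
static int
consume(void)
{
	Payload    *result;

	if (!get_next((void **) &result))
		return -1;
	return result->blkno;
}
```

A typed wrapper function or macro per payload type could hide the cast, at
the cost of another layer of indirection.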

#54 Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#51)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

Thanks for your review! I've made the changes in attached v18.

I do want to know what you think we should do about what you brought
up about lazy_check_wraparound_failsafe() -- given my reply (below).

On Thu, Feb 13, 2025 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Sorry for the late chiming in. I've reviewed the v16 patch set, and
the patches mostly look good. Here are some comments mostly about
cosmetic things:

0001:

-   bool        all_visible_according_to_vm,
-               was_eager_scanned = false;
+   uint8       blk_flags = 0;

Can we probably declare blk_flags inside the main loop?

I've done this in 0002 (it can't be done in 0001 because blk_flags is
used in the while loop condition itself).

0002:

In lazy_scan_heap(), we have a failsafe check at the beginning of the
main loop, which is performed before reading the first block. Isn't it
better to move this check after scanning a block (especially after
incrementing scanned_pages)? Otherwise, we would end up calling
lazy_check_wraparound_failsafe() at the very first loop, which
previously didn't happen without the patch. Since we already called
lazy_check_wraparound_failsafe() just before calling lazy_scan_heap(),
the extra check would not make much sense.

Yes, I agonized over this a bit. The problem with calling
lazy_check_wraparound_failsafe() (and vacuum_delay_point() especially)
after reading the first block is that read_stream_next_buffer()
returns the buffer pinned. We don't want to hang onto that pin for a
long time. But I can't move them to the bottom of the loop after we
release the buffer because some of the code paths don't make it that
far. I don't see a good way other than how I did it or special-casing
block 0. What do you think?

---
+   /* Set up the read stream for vacuum's first pass through the heap */
+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+                                       vacrel->bstrategy,
+                                       vacrel->rel,
+                                       MAIN_FORKNUM,
+                                       heap_vac_scan_next_block,
+                                       vacrel,
+                                       sizeof(bool));

Is there any reason to use sizeof(bool) instead of sizeof(uint8) here?

Nope. That was a buglet (fixed in my v17 but I'm glad you caught it
too). Thanks!

/*
* Vacuum the Free Space Map to make newly-freed space visible on
-            * upper-level FSM pages.  Note we have not yet processed blkno.
+            * upper-level FSM pages.  Note that blkno is the previously
+            * processed block.
*/
FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
blkno);

Given that blkno is already processed, should we pass 'blkno + 1'
instead of blkno?

Good idea! Done in attached v18.

0003:

- while ((iter_result = TidStoreIterateNext(iter)) != NULL)

I think we can declare iter_result in the main loop of lazy_vacuum_heap_rel().

Done.

- Melanie

Attachments:

v18-0003-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patch (text/x-patch)
From 7cd9bd45b2e6a7da333b3f6c1ec46bc9fa90ba8b Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:18 -0500
Subject: [PATCH v18 3/3] Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 55 +++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ea19eed57bf..3c4aa17892e 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2638,6 +2638,32 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ * Gets the next block from the TID store and returns it or InvalidBlockNumber
+ * if there are no further blocks to vacuum.
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/*
+	 * Save the TidStoreIterResult for later, so we can extract the offsets.
+	 * It is safe to copy the result, according to TidStoreIterateNext().
+	 */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2658,11 +2684,11 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
 	TidStoreIter *iter;
-	TidStoreIterResult *iter_result;
 
 	Assert(vacrel->do_index_vacuuming);
 	Assert(vacrel->do_index_cleanup);
@@ -2678,20 +2704,37 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+
+	/* Set up the read stream for vacuum's second pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (true)
 	{
 		BlockNumber blkno;
 		Buffer		buf;
 		Page		page;
+		TidStoreIterResult *iter_result;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
 		int			num_offsets;
 
 		vacuum_delay_point(false);
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		buf = read_stream_next_buffer(stream, (void **) &iter_result);
 
+		/* The relation is exhausted */
+		if (!BufferIsValid(buf))
+			break;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		Assert(iter_result);
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
 
@@ -2703,8 +2746,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2717,6 +2758,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v18-0002-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patch (text/x-patch)
From 18a30acd22c0720e38d6e54873137eeb728f3a99 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:12 -0500
Subject: [PATCH v18 2/3] Use streaming read I/O in VACUUM's first phase

Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aLwANZpxHc0tC-6OT0OQT4TftDGkKAO5yigMUOv_Tcsw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 148 ++++++++++++++++-----------
 1 file changed, 91 insertions(+), 57 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4d0f77ee2f..ea19eed57bf 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,6 +153,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -423,8 +424,9 @@ typedef struct LVSavedErrInfo
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 uint8 *blk_info);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1174,10 +1176,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	uint8		blk_info = 0;
+	void	   *per_buffer_data = NULL;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1201,23 +1204,24 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
+	/* Set up the read stream for vacuum's first pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(uint8));
+
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
+		uint8		blk_info = 0;
 		bool		has_lpdead_items;
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
-			vacrel->eager_scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point(false);
 
 		/*
@@ -1258,10 +1262,10 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages. Note we have not yet processed blkno+1.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
-									blkno);
+									blkno + 1);
 			next_fsm_block_to_vacuum = blkno;
 
 			/* Report that we are once again scanning the heap */
@@ -1269,6 +1273,26 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, &per_buffer_data);
+
+		/* The relation is exhausted. */
+		if (!BufferIsValid(buf))
+			break;
+
+		blk_info = *((uint8 *) per_buffer_data);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+		blkno = BufferGetBlockNumber(buf);
+
+		vacrel->scanned_pages++;
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
+			vacrel->eager_scanned_pages++;
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1276,10 +1300,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1439,8 +1459,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (BufferIsValid(vmbuffer))
 		ReleaseBuffer(vmbuffer);
 
-	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	/*
+	 * Report that everything is now scanned. We never skip scanning the last
+	 * block in the relation, so we can pass rel_pages here.
+	 */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
+								 rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1455,6 +1479,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1465,12 +1491,14 @@ lazy_scan_heap(LVRelState *vacrel)
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
+	 * We can pass rel_pages here because we never skip scanning the last
+	 * block of the relation.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1478,36 +1506,41 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	heap_vac_scan_next_block() -- get next block for vacuum to process
- *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
- *
- * The block number of the next block to process is set in *blkno and its
- * visibility status and whether or not it was eager scanned is set in
- * *blk_info.
- *
- * The return value is false if there are no further blocks to process.
- *
- * vacrel is an in/out parameter here.  Vacuum options and information about
- * the relation are read.  vacrel->skippedallvis is set if we skip a block
- * that's all-visible but not all-frozen, to ensure that we don't update
- * relfrozenxid in that case.  vacrel also holds information about the next
- * unskippable block, as bookkeeping for this function.
+ *	heap_vac_scan_next_block() -- read stream callback to get the next block
+ *	for vacuum to process
+ *
+ * Every time lazy_scan_heap() needs a new block to process during its first
+ * phase, it invokes read_stream_next_buffer() with a stream set up to call
+ * heap_vac_scan_next_block() to get the next block.
+ *
+ * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
+ * various thresholds to skip blocks which do not need to be processed and
+ * returns the next block to process or InvalidBlockNumber if there are no
+ * remaining blocks.
+ *
+ * The visibility status of the next block to process and whether or not it
+ * was eager scanned is set in the per_buffer_data.
+ *
+ * callback_private_data contains a reference to the LVRelState, passed to the
+ * read stream API during stream setup. The LVRelState is an in/out parameter
+ * here (locally named `vacrel`). Vacuum options and information about the
+ * relation are read from it. vacrel->skippedallvis is set if we skip a block
+ * that's all-visible but not all-frozen (to ensure that we don't update
+ * relfrozenxid in that case). vacrel also holds information about the next
+ * unskippable block -- as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 uint8 *blk_info)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	uint8		blk_info = 0;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*blk_info = 0;
-
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
 	{
@@ -1516,8 +1549,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1566,9 +1598,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
-		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		return true;
+		vacrel->current_block = next_block;
+		blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1578,12 +1611,13 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		if (vacrel->next_unskippable_allvis)
-			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+			blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
-			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
-		return true;
+			blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

v18-0001-Convert-heap_vac_scan_next_block-boolean-paramet.patch (text/x-patch)
From 25140d2966d2f87a4a410fc92147f2d88e587920 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 16:49:10 -0500
Subject: [PATCH v18 1/3] Convert heap_vac_scan_next_block() boolean parameters
 to flags

The read stream API only allows one piece of extra per block state to be
passed back to the API user. lazy_scan_heap() needs to know whether or
not a given block was all-visible in the visibility map and whether or
not it was eagerly scanned. Convert these two pieces of information to
flags so that they can be passed as a single argument to
heap_vac_scan_next_block() (which will become the read stream API
callback for heap phase I vacuuming).

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bmx33jTqATP5GKNFYwAg02a9dDtk4U_ciEjgBHZSVkOQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 47 ++++++++++++++++------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3df5b92afb8..c4d0f77ee2f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,6 +248,13 @@ typedef enum
  */
 #define EAGER_SCAN_REGION_SIZE 4096
 
+/*
+ * heap_vac_scan_next_block() sets these flags to communicate information
+ * about the block it read to the caller.
+ */
+#define VAC_BLK_WAS_EAGER_SCANNED (1 << 0)
+#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM (1 << 1)
+
 typedef struct LVRelState
 {
 	/* Target heap relation and its indexes */
@@ -417,8 +424,7 @@ static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
 static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm,
-									 bool *was_eager_scanned);
+									 uint8 *blk_info);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1171,8 +1177,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm,
-				was_eager_scanned = false;
+	uint8		blk_info = 0;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1196,8 +1201,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm,
-									&was_eager_scanned))
+	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
 	{
 		Buffer		buf;
 		Page		page;
@@ -1206,7 +1210,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		got_cleanup_lock = false;
 
 		vacrel->scanned_pages++;
-		if (was_eager_scanned)
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
 			vacrel->eager_scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1331,7 +1335,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer,
+							blk_info & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
 							&has_lpdead_items, &vm_page_frozen);
 
 		/*
@@ -1348,7 +1353,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * exclude pages skipped due to cleanup lock contention from eager
 		 * freeze algorithm caps.
 		 */
-		if (got_cleanup_lock && was_eager_scanned)
+		if (got_cleanup_lock &&
+			(blk_info & VAC_BLK_WAS_EAGER_SCANNED))
 		{
 			/* Aggressive vacuums do not eager scan. */
 			Assert(!vacrel->aggressive);
@@ -1479,11 +1485,11 @@ lazy_scan_heap(LVRelState *vacrel)
  * and various thresholds to skip blocks which do not need to be processed and
  * sets blkno to the next block to process.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process. If the block is being eagerly
- * scanned, was_eager_scanned is set so that the caller can count whether or
- * not an eagerly scanned page is successfully frozen.
+ * The block number of the next block to process is set in *blkno and its
+ * visibility status and whether or not it was eager scanned is set in
+ * *blk_info.
+ *
+ * The return value is false if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1493,15 +1499,14 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static bool
 heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm,
-						 bool *was_eager_scanned)
+						 uint8 *blk_info)
 {
 	BlockNumber next_block;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*was_eager_scanned = false;
+	*blk_info = 0;
 
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
@@ -1562,7 +1567,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * otherwise they would've been unskippable.
 		 */
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = true;
+		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		return true;
 	}
 	else
@@ -1574,8 +1579,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		Assert(next_block == vacrel->next_unskippable_block);
 
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		*was_eager_scanned = vacrel->next_unskippable_eager_scanned;
+		if (vacrel->next_unskippable_allvis)
+			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		if (vacrel->next_unskippable_eager_scanned)
+			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
 		return true;
 	}
 }
-- 
2.34.1

#55Melanie Plageman
melanieplageman@gmail.com
In reply to: Thomas Munro (#53)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 13, 2025 at 6:52 PM Thomas Munro <thomas.munro@gmail.com> wrote:

> I've been poking, reading, and trying out these patches. They look good to me.

Thanks for the review.

> Tiny nit, maybe this comment could say something less obvious, cf the
> similar comment near the other stream:
>
> +       /* Set up the read stream */
> +       stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,

Done in upthread v18.

> I don't really love the cumbersome casting required around
> per_buffer_data, but that's not your patches' fault (hmm, wonder what
> we can do to improve that).

I don't know if you saw v17, but I tried to improve it a bit. The
casting still has to happen, but I at least use the variable as a
uint8 instead of a pointer to a uint8 (dunno if that makes it better
or worse). It is the same in v18.

- Melanie

#56Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#54)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 13, 2025 at 4:55 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

> Thanks for your review! I've made the changes in attached v18.
>
> I do want to know what you think we should do about what you brought
> up about lazy_check_wraparound_failsafe() -- given my reply (below).

> On Thu, Feb 13, 2025 at 6:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

>> Sorry for the late chiming in. I've reviewed the v16 patch set, and
>> the patches mostly look good. Here are some comments mostly about
>> cosmetic things:
>>
>> 0001:
>>
>> -   bool        all_visible_according_to_vm,
>> -               was_eager_scanned = false;
>> +   uint8       blk_flags = 0;
>>
>> Can we probably declare blk_flags inside the main loop?

> I've done this in 0002 (can't in 0001 because of it being used inside
> the while loop itself).

You're right, thanks.

>> 0002:
>>
>> In lazy_scan_heap(), we have a failsafe check at the beginning of the
>> main loop, which is performed before reading the first block. Isn't it
>> better to move this check after scanning a block (especially after
>> incrementing scanned_pages)? Otherwise, we would end up calling
>> lazy_check_wraparound_failsafe() at the very first loop, which
>> previously didn't happen without the patch. Since we already called
>> lazy_check_wraparound_failsafe() just before calling lazy_scan_heap(),
>> the extra check would not make much sense.

> Yes, I agonized over this a bit. The problem with calling
> lazy_check_wraparound_failsafe() (and vacuum_delay_point() especially)
> after reading the first block is that read_stream_next_buffer()
> returns the buffer pinned.

Good point.

> We don't want to hang onto that pin for a
> long time. But I can't move them to the bottom of the loop after we
> release the buffer because some of the code paths don't make it that
> far. I don't see a good way other than how I did it or special-casing
> block 0. What do you think?

How about adding 'vacrel->scanned_pages > 0' to the if statement?
Which seems not odd to me.

Looking at the 0002 patch, it seems you reverted the change to the
following comment:

  /*
   * Vacuum the Free Space Map to make newly-freed space visible on
-  * upper-level FSM pages.  Note we have not yet processed blkno.
+ * upper-level FSM pages. Note we have not yet processed blkno+1.
   */

I feel that the previous change I saw in v17 is clearer:

  /*
  * Vacuum the Free Space Map to make newly-freed space visible on
- * upper-level FSM pages.  Note we have not yet processed blkno.
+ * upper-level FSM pages.  Note that blkno is the previously
+ * processed block.
  */

The rest looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#57Melanie Plageman
melanieplageman@gmail.com
In reply to: Masahiko Sawada (#56)
3 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 13, 2025 at 8:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Thu, Feb 13, 2025 at 4:55 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:

>> We don't want to hang onto that pin for a
>> long time. But I can't move them to the bottom of the loop after we
>> release the buffer because some of the code paths don't make it that
>> far. I don't see a good way other than how I did it or special-casing
>> block 0. What do you think?

> How about adding 'vacrel->scanned_pages > 0' to the if statement?
> Which seems not odd to me.

Cool. I've done this in attached v19.

> Looking at the 0002 patch, it seems you reverted the change to the
> following comment:
>
>   /*
>    * Vacuum the Free Space Map to make newly-freed space visible on
> -  * upper-level FSM pages.  Note we have not yet processed blkno.
> + * upper-level FSM pages. Note we have not yet processed blkno+1.
>    */
>
> I feel that the previous change I saw in v17 is clearer:

I've reverted to the old comment. Thanks

> The rest looks good to me.

Cool! I'll plan to push this tomorrow barring any objections.

- Melanie

Attachments:

v19-0001-Convert-heap_vac_scan_next_block-boolean-paramet.patch (text/x-patch)
From 25140d2966d2f87a4a410fc92147f2d88e587920 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 16:49:10 -0500
Subject: [PATCH v19 1/3] Convert heap_vac_scan_next_block() boolean parameters
 to flags

The read stream API only allows one piece of extra per block state to be
passed back to the API user. lazy_scan_heap() needs to know whether or
not a given block was all-visible in the visibility map and whether or
not it was eagerly scanned. Convert these two pieces of information to
flags so that they can be passed as a single argument to
heap_vac_scan_next_block() (which will become the read stream API
callback for heap phase I vacuuming).

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_bmx33jTqATP5GKNFYwAg02a9dDtk4U_ciEjgBHZSVkOQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 47 ++++++++++++++++------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3df5b92afb8..c4d0f77ee2f 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -248,6 +248,13 @@ typedef enum
  */
 #define EAGER_SCAN_REGION_SIZE 4096
 
+/*
+ * heap_vac_scan_next_block() sets these flags to communicate information
+ * about the block it read to the caller.
+ */
+#define VAC_BLK_WAS_EAGER_SCANNED (1 << 0)
+#define VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM (1 << 1)
+
 typedef struct LVRelState
 {
 	/* Target heap relation and its indexes */
@@ -417,8 +424,7 @@ static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
 static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 bool *all_visible_according_to_vm,
-									 bool *was_eager_scanned);
+									 uint8 *blk_info);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1171,8 +1177,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	BlockNumber rel_pages = vacrel->rel_pages,
 				blkno,
 				next_fsm_block_to_vacuum = 0;
-	bool		all_visible_according_to_vm,
-				was_eager_scanned = false;
+	uint8		blk_info = 0;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1196,8 +1201,7 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm,
-									&was_eager_scanned))
+	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
 	{
 		Buffer		buf;
 		Page		page;
@@ -1206,7 +1210,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		bool		got_cleanup_lock = false;
 
 		vacrel->scanned_pages++;
-		if (was_eager_scanned)
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
 			vacrel->eager_scanned_pages++;
 
 		/* Report as block scanned, update error traceback information */
@@ -1331,7 +1335,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		if (got_cleanup_lock)
 			lazy_scan_prune(vacrel, buf, blkno, page,
-							vmbuffer, all_visible_according_to_vm,
+							vmbuffer,
+							blk_info & VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM,
 							&has_lpdead_items, &vm_page_frozen);
 
 		/*
@@ -1348,7 +1353,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * exclude pages skipped due to cleanup lock contention from eager
 		 * freeze algorithm caps.
 		 */
-		if (got_cleanup_lock && was_eager_scanned)
+		if (got_cleanup_lock &&
+			(blk_info & VAC_BLK_WAS_EAGER_SCANNED))
 		{
 			/* Aggressive vacuums do not eager scan. */
 			Assert(!vacrel->aggressive);
@@ -1479,11 +1485,11 @@ lazy_scan_heap(LVRelState *vacrel)
  * and various thresholds to skip blocks which do not need to be processed and
  * sets blkno to the next block to process.
  *
- * The block number and visibility status of the next block to process are set
- * in *blkno and *all_visible_according_to_vm.  The return value is false if
- * there are no further blocks to process. If the block is being eagerly
- * scanned, was_eager_scanned is set so that the caller can count whether or
- * not an eagerly scanned page is successfully frozen.
+ * The block number of the next block to process is set in *blkno and its
+ * visibility status and whether or not it was eager scanned is set in
+ * *blk_info.
+ *
+ * The return value is false if there are no further blocks to process.
  *
  * vacrel is an in/out parameter here.  Vacuum options and information about
  * the relation are read.  vacrel->skippedallvis is set if we skip a block
@@ -1493,15 +1499,14 @@ lazy_scan_heap(LVRelState *vacrel)
  */
 static bool
 heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 bool *all_visible_according_to_vm,
-						 bool *was_eager_scanned)
+						 uint8 *blk_info)
 {
 	BlockNumber next_block;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*was_eager_scanned = false;
+	*blk_info = 0;
 
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
@@ -1562,7 +1567,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * otherwise they would've been unskippable.
 		 */
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = true;
+		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		return true;
 	}
 	else
@@ -1574,8 +1579,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		Assert(next_block == vacrel->next_unskippable_block);
 
 		*blkno = vacrel->current_block = next_block;
-		*all_visible_according_to_vm = vacrel->next_unskippable_allvis;
-		*was_eager_scanned = vacrel->next_unskippable_eager_scanned;
+		if (vacrel->next_unskippable_allvis)
+			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		if (vacrel->next_unskippable_eager_scanned)
+			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
 		return true;
 	}
 }
-- 
2.34.1

v19-0003-Use-streaming-read-I-O-in-VACUUM-s-third-phase.patch (text/x-patch)
From 1c980c266cb2c1e91c2a86688f7d2e39da0d264d Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:18 -0500
Subject: [PATCH v19 3/3] Use streaming read I/O in VACUUM's third phase

Make vacuum's third phase (its second pass over the heap), which reaps
dead items collected in the first phase and marks them as reusable, use
the read stream API. This commit adds a new read stream callback,
vacuum_reap_lp_read_stream_next(), that looks ahead in the TidStore and
returns the next block number to read for vacuum.

Author: Melanie Plageman <melanieplageman@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKN3oy0bN_3yv8hd78a4%2BM1tJC9z7mD8%2Bf%2ByA%2BGeoFUwQ%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 55 +++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 6 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 08d89ab2bcd..74175e00534 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2640,6 +2640,32 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 	return allindexes;
 }
 
+/*
+ * Read stream callback for vacuum's third phase (second pass over the heap).
+ * Gets the next block from the TID store and returns it or InvalidBlockNumber
+ * if there are no further blocks to vacuum.
+ */
+static BlockNumber
+vacuum_reap_lp_read_stream_next(ReadStream *stream,
+								void *callback_private_data,
+								void *per_buffer_data)
+{
+	TidStoreIter *iter = callback_private_data;
+	TidStoreIterResult *iter_result;
+
+	iter_result = TidStoreIterateNext(iter);
+	if (iter_result == NULL)
+		return InvalidBlockNumber;
+
+	/*
+	 * Save the TidStoreIterResult for later, so we can extract the offsets.
+	 * It is safe to copy the result, according to TidStoreIterateNext().
+	 */
+	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+
+	return iter_result->blkno;
+}
+
 /*
  *	lazy_vacuum_heap_rel() -- second pass over the heap for two pass strategy
  *
@@ -2660,11 +2686,11 @@ lazy_vacuum_all_indexes(LVRelState *vacrel)
 static void
 lazy_vacuum_heap_rel(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber vacuumed_pages = 0;
 	Buffer		vmbuffer = InvalidBuffer;
 	LVSavedErrInfo saved_err_info;
 	TidStoreIter *iter;
-	TidStoreIterResult *iter_result;
 
 	Assert(vacrel->do_index_vacuuming);
 	Assert(vacrel->do_index_cleanup);
@@ -2680,20 +2706,37 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 							 InvalidBlockNumber, InvalidOffsetNumber);
 
 	iter = TidStoreBeginIterate(vacrel->dead_items);
-	while ((iter_result = TidStoreIterateNext(iter)) != NULL)
+
+	/* Set up the read stream for vacuum's second pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										vacuum_reap_lp_read_stream_next,
+										iter,
+										sizeof(TidStoreIterResult));
+
+	while (true)
 	{
 		BlockNumber blkno;
 		Buffer		buf;
 		Page		page;
+		TidStoreIterResult *iter_result;
 		Size		freespace;
 		OffsetNumber offsets[MaxOffsetNumber];
 		int			num_offsets;
 
 		vacuum_delay_point(false);
 
-		blkno = iter_result->blkno;
-		vacrel->blkno = blkno;
+		buf = read_stream_next_buffer(stream, (void **) &iter_result);
 
+		/* The relation is exhausted */
+		if (!BufferIsValid(buf))
+			break;
+
+		vacrel->blkno = blkno = BufferGetBlockNumber(buf);
+
+		Assert(iter_result);
 		num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
 		Assert(num_offsets <= lengthof(offsets));
 
@@ -2705,8 +2748,6 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
 		/* We need a non-cleanup exclusive lock to mark dead_items unused */
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
 		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
 		lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
 							  num_offsets, vmbuffer);
@@ -2719,6 +2760,8 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 		RecordPageWithFreeSpace(vacrel->rel, blkno, freespace);
 		vacuumed_pages++;
 	}
+
+	read_stream_end(stream);
 	TidStoreEndIterate(iter);
 
 	vacrel->blkno = InvalidBlockNumber;
-- 
2.34.1

v19-0002-Use-streaming-read-I-O-in-VACUUM-s-first-phase.patch (text/x-patch)
From ba4b9d8dd13c02bddb6a1013e9a05b4aea688036 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Thu, 13 Feb 2025 17:34:12 -0500
Subject: [PATCH v19 2/3] Use streaming read I/O in VACUUM's first phase

Make vacuum's first phase, which prunes and freezes tuples and records
dead TIDs, use the read stream API by converting
heap_vac_scan_next_block() to a read stream callback.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_aLwANZpxHc0tC-6OT0OQT4TftDGkKAO5yigMUOv_Tcsw%40mail.gmail.com
---
 src/backend/access/heap/vacuumlazy.c | 152 +++++++++++++++++----------
 1 file changed, 94 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4d0f77ee2f..08d89ab2bcd 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -153,6 +153,7 @@
 #include "storage/bufmgr.h"
 #include "storage/freespace.h"
 #include "storage/lmgr.h"
+#include "storage/read_stream.h"
 #include "utils/lsyscache.h"
 #include "utils/pg_rusage.h"
 #include "utils/timestamp.h"
@@ -423,8 +424,9 @@ typedef struct LVSavedErrInfo
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
 										 VacuumParams *params);
-static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-									 uint8 *blk_info);
+static BlockNumber heap_vac_scan_next_block(ReadStream *stream,
+											void *callback_private_data,
+											void *per_buffer_data);
 static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis);
 static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf,
 								   BlockNumber blkno, Page page,
@@ -1174,10 +1176,11 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 static void
 lazy_scan_heap(LVRelState *vacrel)
 {
+	ReadStream *stream;
 	BlockNumber rel_pages = vacrel->rel_pages,
-				blkno,
+				blkno = 0,
 				next_fsm_block_to_vacuum = 0;
-	uint8		blk_info = 0;
+	void	   *per_buffer_data = NULL;
 	BlockNumber orig_eager_scan_success_limit =
 		vacrel->eager_scan_remaining_successes; /* for logging */
 	Buffer		vmbuffer = InvalidBuffer;
@@ -1201,23 +1204,24 @@ lazy_scan_heap(LVRelState *vacrel)
 	vacrel->next_unskippable_eager_scanned = false;
 	vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 
-	while (heap_vac_scan_next_block(vacrel, &blkno, &blk_info))
+	/* Set up the read stream for vacuum's first pass through the heap */
+	stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE,
+										vacrel->bstrategy,
+										vacrel->rel,
+										MAIN_FORKNUM,
+										heap_vac_scan_next_block,
+										vacrel,
+										sizeof(uint8));
+
+	while (true)
 	{
 		Buffer		buf;
 		Page		page;
+		uint8		blk_info = 0;
 		bool		has_lpdead_items;
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
-		vacrel->scanned_pages++;
-		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
-			vacrel->eager_scanned_pages++;
-
-		/* Report as block scanned, update error traceback information */
-		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
-		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
-								 blkno, InvalidOffsetNumber);
-
 		vacuum_delay_point(false);
 
 		/*
@@ -1229,7 +1233,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		 * one-pass strategy, and the two-pass strategy with the index_cleanup
 		 * param set to 'off'.
 		 */
-		if (vacrel->scanned_pages % FAILSAFE_EVERY_PAGES == 0)
+		if (vacrel->scanned_pages > 0 &&
+			vacrel->scanned_pages % FAILSAFE_EVERY_PAGES == 0)
 			lazy_check_wraparound_failsafe(vacrel);
 
 		/*
@@ -1258,10 +1263,11 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
-			 * upper-level FSM pages.  Note we have not yet processed blkno.
+			 * upper-level FSM pages. Note that blkno is the previously
+			 * processed block.
 			 */
 			FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum,
-									blkno);
+									blkno + 1);
 			next_fsm_block_to_vacuum = blkno;
 
 			/* Report that we are once again scanning the heap */
@@ -1269,6 +1275,26 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
+		buf = read_stream_next_buffer(stream, &per_buffer_data);
+
+		/* The relation is exhausted. */
+		if (!BufferIsValid(buf))
+			break;
+
+		blk_info = *((uint8 *) per_buffer_data);
+		CheckBufferIsPinnedOnce(buf);
+		page = BufferGetPage(buf);
+		blkno = BufferGetBlockNumber(buf);
+
+		vacrel->scanned_pages++;
+		if (blk_info & VAC_BLK_WAS_EAGER_SCANNED)
+			vacrel->eager_scanned_pages++;
+
+		/* Report as block scanned, update error traceback information */
+		pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+		update_vacuum_error_info(vacrel, NULL, VACUUM_ERRCB_PHASE_SCAN_HEAP,
+								 blkno, InvalidOffsetNumber);
+
 		/*
 		 * Pin the visibility map page in case we need to mark the page
 		 * all-visible.  In most cases this will be very cheap, because we'll
@@ -1276,10 +1302,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		 */
 		visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
 
-		buf = ReadBufferExtended(vacrel->rel, MAIN_FORKNUM, blkno, RBM_NORMAL,
-								 vacrel->bstrategy);
-		page = BufferGetPage(buf);
-
 		/*
 		 * We need a buffer cleanup lock to prune HOT chains and defragment
 		 * the page in lazy_scan_prune.  But when it's not possible to acquire
@@ -1439,8 +1461,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (BufferIsValid(vmbuffer))
 		ReleaseBuffer(vmbuffer);
 
-	/* report that everything is now scanned */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
+	/*
+	 * Report that everything is now scanned. We never skip scanning the last
+	 * block in the relation, so we can pass rel_pages here.
+	 */
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED,
+								 rel_pages);
 
 	/* now we can compute the new value for pg_class.reltuples */
 	vacrel->new_live_tuples = vac_estimate_reltuples(vacrel->rel, rel_pages,
@@ -1455,6 +1481,8 @@ lazy_scan_heap(LVRelState *vacrel)
 		Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
 		vacrel->missed_dead_tuples;
 
+	read_stream_end(stream);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1465,12 +1493,14 @@ lazy_scan_heap(LVRelState *vacrel)
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
+	 * We can pass rel_pages here because we never skip scanning the last
+	 * block of the relation.
 	 */
-	if (blkno > next_fsm_block_to_vacuum)
-		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, blkno);
+	if (rel_pages > next_fsm_block_to_vacuum)
+		FreeSpaceMapVacuumRange(vacrel->rel, next_fsm_block_to_vacuum, rel_pages);
 
 	/* report all blocks vacuumed */
-	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
+	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
@@ -1478,36 +1508,41 @@ lazy_scan_heap(LVRelState *vacrel)
 }
 
 /*
- *	heap_vac_scan_next_block() -- get next block for vacuum to process
- *
- * lazy_scan_heap() calls here every time it needs to get the next block to
- * prune and vacuum.  The function uses the visibility map, vacuum options,
- * and various thresholds to skip blocks which do not need to be processed and
- * sets blkno to the next block to process.
- *
- * The block number of the next block to process is set in *blkno and its
- * visibility status and whether or not it was eager scanned is set in
- * *blk_info.
- *
- * The return value is false if there are no further blocks to process.
- *
- * vacrel is an in/out parameter here.  Vacuum options and information about
- * the relation are read.  vacrel->skippedallvis is set if we skip a block
- * that's all-visible but not all-frozen, to ensure that we don't update
- * relfrozenxid in that case.  vacrel also holds information about the next
- * unskippable block, as bookkeeping for this function.
+ *	heap_vac_scan_next_block() -- read stream callback to get the next block
+ *	for vacuum to process
+ *
+ * Every time lazy_scan_heap() needs a new block to process during its first
+ * phase, it invokes read_stream_next_buffer() with a stream set up to call
+ * heap_vac_scan_next_block() to get the next block.
+ *
+ * heap_vac_scan_next_block() uses the visibility map, vacuum options, and
+ * various thresholds to skip blocks which do not need to be processed and
+ * returns the next block to process or InvalidBlockNumber if there are no
+ * remaining blocks.
+ *
+ * The visibility status of the next block to process and whether or not it
+ * was eager scanned is set in the per_buffer_data.
+ *
+ * callback_private_data contains a reference to the LVRelState, passed to the
+ * read stream API during stream setup. The LVRelState is an in/out parameter
+ * here (locally named `vacrel`). Vacuum options and information about the
+ * relation are read from it. vacrel->skippedallvis is set if we skip a block
+ * that's all-visible but not all-frozen (to ensure that we don't update
+ * relfrozenxid in that case). vacrel also holds information about the next
+ * unskippable block -- as bookkeeping for this function.
  */
-static bool
-heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
-						 uint8 *blk_info)
+static BlockNumber
+heap_vac_scan_next_block(ReadStream *stream,
+						 void *callback_private_data,
+						 void *per_buffer_data)
 {
 	BlockNumber next_block;
+	LVRelState *vacrel = callback_private_data;
+	uint8		blk_info = 0;
 
 	/* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */
 	next_block = vacrel->current_block + 1;
 
-	*blk_info = 0;
-
 	/* Have we reached the end of the relation? */
 	if (next_block >= vacrel->rel_pages)
 	{
@@ -1516,8 +1551,7 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 			ReleaseBuffer(vacrel->next_unskippable_vmbuffer);
 			vacrel->next_unskippable_vmbuffer = InvalidBuffer;
 		}
-		*blkno = vacrel->rel_pages;
-		return false;
+		return InvalidBlockNumber;
 	}
 
 	/*
@@ -1566,9 +1600,10 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 * but chose not to.  We know that they are all-visible in the VM,
 		 * otherwise they would've been unskippable.
 		 */
-		*blkno = vacrel->current_block = next_block;
-		*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		return true;
+		vacrel->current_block = next_block;
+		blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 	else
 	{
@@ -1578,12 +1613,13 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno,
 		 */
 		Assert(next_block == vacrel->next_unskippable_block);
 
-		*blkno = vacrel->current_block = next_block;
+		vacrel->current_block = next_block;
 		if (vacrel->next_unskippable_allvis)
-			*blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
+			blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
-			*blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
-		return true;
+			blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
+		*((uint8 *) per_buffer_data) = blk_info;
+		return vacrel->current_block;
 	}
 }
 
-- 
2.34.1

#58Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#57)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 13, 2025 at 9:06 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Feb 13, 2025 at 8:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The rest looks good to me.

Cool! I'll plan to push this tomorrow barring any objections.

I've committed this and marked it as such in the CF app.

- Melanie

#59Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#58)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Feb 14, 2025 at 1:15 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Feb 13, 2025 at 9:06 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Feb 13, 2025 at 8:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The rest looks good to me.

Cool! I'll plan to push this tomorrow barring any objections.

I've committed this and marked it as such in the CF app.

Seems valgrind doesn't like this [1]. I'm looking into it.

- Melanie

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-02-14%2018%3A00%3A12

#60Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#59)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sat, Feb 15, 2025 at 7:30 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Seems valgrind doesn't like this [1]. I'm looking into it.

Melanie was able to reproduce this on her local valgrind and
eventually we figured out that it's my fault. I put code into
read_stream.c that calls wipe_mem(), thinking that that was our
standard way of scribbling 0x7f on memory that you shouldn't access
again until it's reused. I didn't realise that wipe_mem() also tells
valgrind that the memory is now "no access". That makes sense for
palloc/pfree because when that range is allocated again it'll clear
that. The point is to help people discover that they have a dangling
reference to per-buffer data after they advance to the next buffer,
which wouldn't work because it's in a circular queue and could be
overwritten any time after that.

This fixes it, but is not yet my proposed change:

@@ -193,9 +193,12 @@ read_stream_get_block(ReadStream *stream, void *per_buffer_data)
        if (blocknum != InvalidBlockNumber)
                stream->buffered_blocknum = InvalidBlockNumber;
        else
+       {
+               VALGRIND_MAKE_MEM_UNDEFINED(per_buffer_data,
+                                           stream->per_buffer_data_size);
                blocknum = stream->callback(stream,
                                            stream->callback_private_data,
                                            per_buffer_data);
+       }

Thinking about how to make it more symmetrical...
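The hazard being guarded against here can be seen in a toy model (this is
illustrative only, not the real read_stream.c): per-buffer data lives in a
small circular queue, so a pointer handed out for one buffer is recycled once
the consumer has advanced far enough around the queue. All names below are
hypothetical.

```c
#include <stdint.h>

/* Toy stand-in for a stream's per-buffer data queue. */
#define QUEUE_SIZE 4

typedef struct ToyStream
{
	uint8_t		per_buffer_data[QUEUE_SIZE];
	int			next_index;
} ToyStream;

/* Hand out the next per-buffer data slot, wrapping around the queue. */
static uint8_t *
toy_next_slot(ToyStream *stream)
{
	uint8_t    *slot = &stream->per_buffer_data[stream->next_index];

	stream->next_index = (stream->next_index + 1) % QUEUE_SIZE;
	return slot;
}
```

After QUEUE_SIZE advances the same address comes back, so a dangling
reference held from an earlier iteration silently aliases whatever is written
next; poisoning the slot ("noaccess" on release, "undefined" on reuse) is
what lets Valgrind catch that.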

#61Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#60)
1 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sat, Feb 15, 2025 at 10:50 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Sat, Feb 15, 2025 at 7:30 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Seems valgrind doesn't like this [1]. I'm looking into it.

Melanie was able to reproduce this on her local valgrind and
eventually we figured out that it's my fault. I put code into
read_stream.c that calls wipe_mem(), thinking that that was our
standard way of scribbling 0x7f on memory that you shouldn't access
again until it's reused. I didn't realise that wipe_mem() also tells
valgrind that the memory is now "no access". That makes sense for
palloc/pfree because when that range is allocated again it'll clear
that. The point is to help people discover that they have a dangling
reference to per-buffer data after they advance to the next buffer,
which wouldn't work because it's in a circular queue and could be
overwritten any time after that.

Here's a patch. Is there a tidier way to write this?

It should probably be back-patched to 17, because external code might
use per-buffer data (obviously v17 core doesn't or skink would have
told us this sooner). It's not a good time to push to 17 today,
though. Push to master now to cheer skink up and 17 some time later
when the coast is clear, or just wait?

Attachments:

0001-Fix-explicit-valgrind-interaction-in-read_stream.c.patchtext/x-patch; charset=US-ASCII; name=0001-Fix-explicit-valgrind-interaction-in-read_stream.c.patchDownload
From bf4ba8b334c7ea6fcd33e8e6a6a4628c88161624 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 15 Feb 2025 11:06:30 +1300
Subject: [PATCH] Fix explicit valgrind interaction in read_stream.c.

By calling wipe_mem() on per-buffer data memory that has been released,
we are also telling Valgrind that the memory is "noaccess".  We need to
set it to "undefined" before giving it to the registered callback to
fill in, when a slot is reused.

As discovered by build farm animal skink when the VACUUM streamification
patches landed (the first users of per-buffer data).

Back-patch to 17, where read streams landed.  There aren't any users of
per-buffer data in 17, but extension code might do that.

Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2Bg6aXpi2FEHqeLOzE%2BxYw%3DOV%2B-N5jhOEnnV%2BF0USM9xA%40mail.gmail.com
---
 src/backend/storage/aio/read_stream.c | 39 ++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index e4414b2e915..d722eda7d8e 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -193,9 +193,20 @@ read_stream_get_block(ReadStream *stream, void *per_buffer_data)
 	if (blocknum != InvalidBlockNumber)
 		stream->buffered_blocknum = InvalidBlockNumber;
 	else
+	{
+		/*
+		 * Tell Valgrind that the per-buffer data is undefined.  That replaces
+		 * the "noaccess" state that was set when the consumer moved past this
+		 * entry last time around the queue, and should also catch callbacks
+		 * that fail to initialize data that the buffer consumer later
+		 * accesses.  On the first go around, it is undefined already.
+		 */
+		VALGRIND_MAKE_MEM_UNDEFINED(per_buffer_data,
+									stream->per_buffer_data_size);
 		blocknum = stream->callback(stream,
 									stream->callback_private_data,
 									per_buffer_data);
+	}
 
 	return blocknum;
 }
@@ -752,8 +763,11 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 	}
 
 #ifdef CLOBBER_FREED_MEMORY
-	/* Clobber old buffer and per-buffer data for debugging purposes. */
+	/* Clobber old buffer for debugging purposes. */
 	stream->buffers[oldest_buffer_index] = InvalidBuffer;
+#endif
+
+#if defined(CLOBBER_FREED_MEMORY) || defined(USE_VALGRIND)
 
 	/*
 	 * The caller will get access to the per-buffer data, until the next call.
@@ -762,11 +776,24 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 	 * that is holding a dangling pointer to it.
 	 */
 	if (stream->per_buffer_data)
-		wipe_mem(get_per_buffer_data(stream,
-									 oldest_buffer_index == 0 ?
-									 stream->queue_size - 1 :
-									 oldest_buffer_index - 1),
-				 stream->per_buffer_data_size);
+	{
+		void	   *per_buffer_data;
+
+		per_buffer_data = get_per_buffer_data(stream,
+											  oldest_buffer_index == 0 ?
+											  stream->queue_size - 1 :
+											  oldest_buffer_index - 1);
+
+#if defined(CLOBBER_FREED_MEMORY)
+		/* This also tells Valgrind the memory is "noaccess". */
+		wipe_mem(get_per_buffer_data(per_buffer_data,
+									 stream->per_buffer_data_size));
+#elif defined(USE_VALGRIND)
+		/* Tell it ourselves. */
+		VALGRIND_MAKE_MEM_NO_ACCESS(per_buffer_data,
+									stream->per_buffer_data_size);
+#endif
+	}
 #endif
 
 	/* Pin transferred to caller. */
-- 
2.48.1

#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#61)
Re: Confine vacuum skip logic to lazy_scan_skip

Thomas Munro <thomas.munro@gmail.com> writes:

Here's a patch. Is there a tidier way to write this?

Hmm, I think not with the current set of primitives. We could think
about refactoring them, but that's not a job for a band-aid patch.

It should probably be back-patched to 17, because external code might
use per-buffer data (obviously v17 core doesn't or skink would have
told us this sooner). It's not a good time to push to 17 today,
though. Push to master now to cheer skink up and 17 some time later
when the coast is clear, or just wait?

Agreed that right now is a bad time to push this to v17 --- we need to
keep the risk factors as low as possible for the re-release. Master
now and v17 after the re-wrap seems like the right compromise.

regards, tom lane

#63Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#62)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sat, Feb 15, 2025 at 12:03 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Here's a patch. Is there a tidier way to write this?

Hmm, I think not with the current set of primitives. We could think
about refactoring them, but that's not a job for a band-aid patch.

Thanks for looking.

It should probably be back-patched to 17, because external code might
use per-buffer data (obviously v17 core doesn't or skink would have
told us this sooner). It's not a good time to push to 17 today,
though. Push to master now to cheer skink up and 17 some time later
when the coast is clear, or just wait?

Agreed that right now is a bad time to push this to v17 --- we need to
keep the risk factors as low as possible for the re-release. Master
now and v17 after the re-wrap seems like the right compromise.

Cool, will push to master. Melanie, could you please confirm that
this patch works for you? I haven't figured out what I'm doing wrong
but my local Valgrind doesn't seem to show the problem (USE_VALGRIND
defined, Debian's Valgrind v3.19.0).

#64Melanie Plageman
melanieplageman@gmail.com
In reply to: Thomas Munro (#63)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Feb 14, 2025 at 6:09 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Agreed that right now is a bad time to push this to v17 --- we need to
keep the risk factors as low as possible for the re-release. Master
now and v17 after the re-wrap seems like the right compromise.

Cool, will push to master. Melanie, could you please confirm that
this patch works for you? I haven't figured out what I'm doing wrong
but my local Valgrind doesn't seem to show the problem (USE_VALGRIND
defined, Debian's Valgrind v3.19.0).

It fixed the issue (after an off-list correction to the patch by Thomas).

- Melanie

#65Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#64)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sat, Feb 15, 2025 at 12:50 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

It fixed the issue (after an off-list correction to the patch by Thomas).

Thanks! It's green again.

#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#65)
Re: Confine vacuum skip logic to lazy_scan_skip

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! It's green again.

The security team's Coverity instance complained about this patch:

*** CID 1642971: Null pointer dereferences (FORWARD_NULL)
/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 1295 in lazy_scan_heap()
1289 buf = read_stream_next_buffer(stream, &per_buffer_data);
1290
1291 /* The relation is exhausted. */
1292 if (!BufferIsValid(buf))
1293 break;
1294

CID 1642971: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "per_buffer_data".

1295 blk_info = *((uint8 *) per_buffer_data);
1296 CheckBufferIsPinnedOnce(buf);
1297 page = BufferGetPage(buf);
1298 blkno = BufferGetBlockNumber(buf);
1299
1300 vacrel->scanned_pages++;

Basically, Coverity doesn't understand that a successful call to
read_stream_next_buffer must set per_buffer_data here. I don't
think there's much chance of teaching it that, so we'll just
have to dismiss this item as "intentional, not a bug". However,
I do have a suggestion: I think the "per_buffer_data" variable
should be declared inside the "while (true)" loop not outside.
That way there is no chance of a value being carried across
iterations, so that if for some reason read_stream_next_buffer
failed to do what we expect and did not set per_buffer_data,
we'd be certain to get a null-pointer core dump rather than
accessing data from a previous buffer.

regards, tom lane
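The defensive pattern suggested here can be sketched as follows (a toy
stand-in, not the real API: `toy_next` is a hypothetical substitute for
read_stream_next_buffer() that sets `*data` only on success):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical producer: returns false when exhausted; on success it
 * points *data at the per-buffer payload. */
static bool
toy_next(int *calls_left, int *payload, void **data)
{
	if (*calls_left == 0)
		return false;
	(*calls_left)--;
	*data = payload;
	return true;
}

static int
consume_all(int nblocks)
{
	int			payload = 42;
	int			seen = 0;

	while (true)
	{
		/* Declared inside the loop: a stale value cannot carry across
		 * iterations, so a producer that forgot to set it would crash
		 * on a NULL dereference instead of reading the previous
		 * buffer's data. */
		void	   *per_buffer_data = NULL;

		if (!toy_next(&nblocks, &payload, &per_buffer_data))
			break;
		seen += *(int *) per_buffer_data;
	}
	return seen;
}
```

Narrowing the scope turns a subtle wrong-data bug into an immediate,
debuggable crash, which is the trade being proposed.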

#67Melanie Plageman
melanieplageman@gmail.com
In reply to: Tom Lane (#66)
Re: Confine vacuum skip logic to lazy_scan_skip

On Sun, Feb 16, 2025 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! It's green again.

The security team's Coverity instance complained about this patch:

*** CID 1642971: Null pointer dereferences (FORWARD_NULL)
/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 1295 in lazy_scan_heap()
1289 buf = read_stream_next_buffer(stream, &per_buffer_data);
1290
1291 /* The relation is exhausted. */
1292 if (!BufferIsValid(buf))
1293 break;
1294

CID 1642971: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "per_buffer_data".

1295 blk_info = *((uint8 *) per_buffer_data);
1296 CheckBufferIsPinnedOnce(buf);
1297 page = BufferGetPage(buf);
1298 blkno = BufferGetBlockNumber(buf);
1299
1300 vacrel->scanned_pages++;

Basically, Coverity doesn't understand that a successful call to
read_stream_next_buffer must set per_buffer_data here. I don't
think there's much chance of teaching it that, so we'll just
have to dismiss this item as "intentional, not a bug".

Is this easy to do? Like is there a list of things from coverity to ignore?

I do have a suggestion: I think the "per_buffer_data" variable
should be declared inside the "while (true)" loop not outside.
That way there is no chance of a value being carried across
iterations, so that if for some reason read_stream_next_buffer
failed to do what we expect and did not set per_buffer_data,
we'd be certain to get a null-pointer core dump rather than
accessing data from a previous buffer.

Done and pushed. Thanks!

- Melanie

#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Melanie Plageman (#67)
Re: Confine vacuum skip logic to lazy_scan_skip

Melanie Plageman <melanieplageman@gmail.com> writes:

On Sun, Feb 16, 2025 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Basically, Coverity doesn't understand that a successful call to
read_stream_next_buffer must set per_buffer_data here. I don't
think there's much chance of teaching it that, so we'll just
have to dismiss this item as "intentional, not a bug".

Is this easy to do? Like is there a list of things from coverity to ignore?

Their website has a table of live issues, and we can just mark this
one "dismissed". I'm not entirely sure how they recognize dismissed
issues --- it's not perfect, because old complaints tend to get
resurrected after changes in nearby code. But it's good enough.

I do have a suggestion: I think the "per_buffer_data" variable
should be declared inside the "while (true)" loop not outside.

Done and pushed. Thanks!

Thanks, looks better now.

regards, tom lane

#69Ranier Vilela
ranier.vf@gmail.com
In reply to: Melanie Plageman (#67)
1 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

Hi.

Em ter., 18 de fev. de 2025 às 11:31, Melanie Plageman <
melanieplageman@gmail.com> escreveu:

On Sun, Feb 16, 2025 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! It's green again.

The security team's Coverity instance complained about this patch:

*** CID 1642971: Null pointer dereferences (FORWARD_NULL)

/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 1295 in lazy_scan_heap()

1289 buf = read_stream_next_buffer(stream, &per_buffer_data);

1290
1291 /* The relation is exhausted. */
1292 if (!BufferIsValid(buf))
1293 break;
1294

CID 1642971: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "per_buffer_data".

1295 blk_info = *((uint8 *) per_buffer_data);
1296 CheckBufferIsPinnedOnce(buf);
1297 page = BufferGetPage(buf);
1298 blkno = BufferGetBlockNumber(buf);
1299
1300 vacrel->scanned_pages++;

Basically, Coverity doesn't understand that a successful call to
read_stream_next_buffer must set per_buffer_data here. I don't
think there's much chance of teaching it that, so we'll just
have to dismiss this item as "intentional, not a bug".

Is this easy to do? Like is there a list of things from coverity to ignore?

I do have a suggestion: I think the "per_buffer_data" variable
should be declared inside the "while (true)" loop not outside.
That way there is no chance of a value being carried across
iterations, so that if for some reason read_stream_next_buffer
failed to do what we expect and did not set per_buffer_data,
we'd be certain to get a null-pointer core dump rather than
accessing data from a previous buffer.

Done and pushed. Thanks!

Per Coverity.

CID 1592454: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
8. var_deref_op: Dereferencing null pointer per_buffer_data.

I think that function *read_stream_next_buffer* can return
a invalid per_buffer_data pointer, with a valid buffer.

Sorry if I'm wrong, but the function is very suspicious.

Attached a patch, which tries to fix.

best regards,
Ranier Vilela

Attachments:

fix-possible-invalid-pointer-read_stream.patchapplication/octet-stream; name=fix-possible-invalid-pointer-read_stream.patchDownload
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4..18e9b4f3c4 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -666,6 +666,8 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 										READ_BUFFERS_ISSUE_ADVICE : 0)))
 			{
 				/* Fast return. */
+				if (per_buffer_data)
+					*per_buffer_data = get_per_buffer_data(stream, oldest_buffer_index);
 				return buffer;
 			}
 
@@ -682,9 +684,14 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
 			stream->distance = 0;
 			stream->oldest_buffer_index = stream->next_buffer_index;
 			stream->pinned_buffers = 0;
+			oldest_buffer_index = stream->next_buffer_index;
 		}
 
 		stream->fast_path = false;
+
+		if (per_buffer_data)
+			*per_buffer_data = get_per_buffer_data(stream, oldest_buffer_index);
+
 		return buffer;
 	}
 #endif
#70Andres Freund
andres@anarazel.de
In reply to: Ranier Vilela (#69)
Re: Confine vacuum skip logic to lazy_scan_skip

Hi,

On 2025-02-27 14:32:28 -0300, Ranier Vilela wrote:

Hi.

Em ter., 18 de fev. de 2025 às 11:31, Melanie Plageman <
melanieplageman@gmail.com> escreveu:

On Sun, Feb 16, 2025 at 1:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Thanks! It's green again.

The security team's Coverity instance complained about this patch:

*** CID 1642971: Null pointer dereferences (FORWARD_NULL)

/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 1295 in lazy_scan_heap()

1289 buf = read_stream_next_buffer(stream, &per_buffer_data);

1290
1291 /* The relation is exhausted. */
1292 if (!BufferIsValid(buf))
1293 break;
1294

CID 1642971: Null pointer dereferences (FORWARD_NULL)
Dereferencing null pointer "per_buffer_data".

1295 blk_info = *((uint8 *) per_buffer_data);
1296 CheckBufferIsPinnedOnce(buf);
1297 page = BufferGetPage(buf);
1298 blkno = BufferGetBlockNumber(buf);
1299
1300 vacrel->scanned_pages++;

Basically, Coverity doesn't understand that a successful call to
read_stream_next_buffer must set per_buffer_data here. I don't
think there's much chance of teaching it that, so we'll just
have to dismiss this item as "intentional, not a bug".

Is this easy to do? Like is there a list of things from coverity to ignore?

I do have a suggestion: I think the "per_buffer_data" variable
should be declared inside the "while (true)" loop not outside.
That way there is no chance of a value being carried across
iterations, so that if for some reason read_stream_next_buffer
failed to do what we expect and did not set per_buffer_data,
we'd be certain to get a null-pointer core dump rather than
accessing data from a previous buffer.

Done and pushed. Thanks!

Per Coverity.

CID 1592454: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
8. var_deref_op: Dereferencing null pointer per_buffer_data.

That's exactly what the messages you quoted are discussing, no?

Sorry if I'm wrong, but the function is very suspicious.

How so?

diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4..18e9b4f3c4 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -666,6 +666,8 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
READ_BUFFERS_ISSUE_ADVICE : 0)))
{
/* Fast return. */
+				if (per_buffer_data)
+					*per_buffer_data = get_per_buffer_data(stream, oldest_buffer_index);
return buffer;
}

A few lines above:
Assert(stream->per_buffer_data_size == 0);

The fast path isn't used when per buffer data is used. Adding a check for
per_buffer_data and assigning something to it is nonsensical.

Greetings,

Andres Freund

#71Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#70)
Re: Confine vacuum skip logic to lazy_scan_skip

Hi,

On 2025-02-27 12:44:24 -0500, Andres Freund wrote:

CID 1592454: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
8. var_deref_op: Dereferencing null pointer per_buffer_data.

That's exactly what the messages you quoted are discussing, no?

Ah, no, it isn't. But I still think the coverity alert and the patch don't
make sense, as per the below:

diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4..18e9b4f3c4 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -666,6 +666,8 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
READ_BUFFERS_ISSUE_ADVICE : 0)))
{
/* Fast return. */
+				if (per_buffer_data)
+					*per_buffer_data = get_per_buffer_data(stream, oldest_buffer_index);
return buffer;
}

A few lines above:
Assert(stream->per_buffer_data_size == 0);

The fast path isn't used when per buffer data is used. Adding a check for
per_buffer_data and assigning something to it is nonsensical.

Greetings,

Andres Freund

#72Ranier Vilela
ranier.vf@gmail.com
In reply to: Andres Freund (#71)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 27, 2025 at 2:49 PM Andres Freund <andres@anarazel.de>
wrote:

Hi,

On 2025-02-27 12:44:24 -0500, Andres Freund wrote:

CID 1592454: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
8. var_deref_op: Dereferencing null pointer per_buffer_data.

That's exactly what the messages you quoted are discussing, no?

Ah, no, it isn't. But I still think the coverity alert and the patch don't
make sense, as per the below:

diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4..18e9b4f3c4 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -666,6 +666,8 @@ read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
READ_BUFFERS_ISSUE_ADVICE : 0)))
{
/* Fast return. */
+				if (per_buffer_data)
+					*per_buffer_data = get_per_buffer_data(stream, oldest_buffer_index);
return buffer;
}

A few lines above:
Assert(stream->per_buffer_data_size == 0);

The fast path isn't used when per buffer data is used. Adding a check for
per_buffer_data and assigning something to it is nonsensical.

Perhaps.

But the fast path combined with the void **per_buffer_data parameter
still seems like a serious risk to me at runtime.

best regards,
Ranier Vilela

#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#71)
Re: Confine vacuum skip logic to lazy_scan_skip

Andres Freund <andres@anarazel.de> writes:

Ah, no, it isn't. But I still think the coverity alert and the patch don't
make sense, as per the below:

Coverity's alert makes perfect sense if you posit that Coverity
doesn't assume that this read_stream_next_buffer call will
only be applied to a stream that has per_buffer_data_size > 0.
(Even if it did understand that, I wouldn't assume that it's
smart enough to see that the fast path will never be taken.)

I wonder if it'd be a good idea to add something like

Assert(stream->distance == 1);
Assert(stream->pending_read_nblocks == 0);
Assert(stream->per_buffer_data_size == 0);
+ Assert(per_buffer_data == NULL);

in read_stream_next_buffer. I doubt that this will shut Coverity
up, but it would help to catch caller coding errors, i.e. passing
a per_buffer_data pointer when there's no per-buffer data.

On the whole I doubt we can get rid of this warning without some
significant redesign of the read_stream API, and I don't think
it's worth the trouble. Coverity is a tool, not a requirement.
I'm content to just dismiss the warning.

regards, tom lane
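
The caller-error check Tom proposes can be illustrated with a stand-alone sketch (mock_stream and mock_fast_path_ok are invented names, not PostgreSQL API): when no per-buffer data is configured, a caller that nevertheless passes a pointer it expects to be filled gets caught immediately instead of silently reading garbage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the stream state; only the field relevant to
 * the proposed assertion is modeled. */
struct mock_stream
{
	size_t		per_buffer_data_size;
};

/* Fast-path sketch: returns 1 only when the fast path is taken AND the
 * caller correctly passed NULL.  A real Assert(per_buffer_data == NULL)
 * would fire where this returns 0 for a non-NULL pointer. */
int
mock_fast_path_ok(struct mock_stream *stream, void **per_buffer_data)
{
	if (stream->per_buffer_data_size != 0)
		return 0;			/* fast path not applicable */
	if (per_buffer_data != NULL)
		return 0;			/* the proposed assert would fire here */
	return 1;
}
```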

#74Melanie Plageman
melanieplageman@gmail.com
In reply to: Tom Lane (#73)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Feb 27, 2025 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder if it'd be a good idea to add something like

Assert(stream->distance == 1);
Assert(stream->pending_read_nblocks == 0);
Assert(stream->per_buffer_data_size == 0);
+ Assert(per_buffer_data == NULL);

in read_stream_next_buffer. I doubt that this will shut Coverity
up, but it would help to catch caller coding errors, i.e. passing
a per_buffer_data pointer when there's no per-buffer data.

I think this is a good stopgap. I was discussing adding this assert
off-list with Thomas and he wanted to detail his more ambitious plans
for type safety improvements in the read stream API. Less on the order
of a redesign and more like separate read_stream_next_buffer() variants
for when there is per-buffer data and when there isn't, plus by-value
and by-reference versions for the case where there is data.

I'll plan to add this assert tomorrow if that discussion doesn't materialize.

- Melanie

#75Thomas Munro
thomas.munro@gmail.com
In reply to: Melanie Plageman (#74)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Feb 28, 2025 at 11:58 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Feb 27, 2025 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder if it'd be a good idea to add something like

Assert(stream->distance == 1);
Assert(stream->pending_read_nblocks == 0);
Assert(stream->per_buffer_data_size == 0);
+ Assert(per_buffer_data == NULL);

in read_stream_next_buffer. I doubt that this will shut Coverity
up, but it would help to catch caller coding errors, i.e. passing
a per_buffer_data pointer when there's no per-buffer data.

I think this is a good stopgap. I was discussing adding this assert
off-list with Thomas and he wanted to detail his more ambitious plans
for type safety improvements in the read stream API. Less on the order
of a redesign and more like separate read_stream_next_buffer() variants
for when there is per-buffer data and when there isn't, plus by-value
and by-reference versions for the case where there is data.

Here's what I had in mind. Is it better?

Attachments:

0001-Improve-API-for-retrieving-data-from-read-streams.patchtext/x-patch; charset=US-ASCII; name=0001-Improve-API-for-retrieving-data-from-read-streams.patchDownload
From 68e9424b590051959142917459eb4ea074589b79 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 28 Feb 2025 10:48:29 +1300
Subject: [PATCH 1/2] Improve API for retrieving data from read streams.

Dealing with the per_buffer_data argument to read_stream_next_buffer()
has proven a bit clunky.  Provide some new wrapper functions/macros:

    buffer = read_stream_get_buffer(rs);
    buffer = read_stream_get_buffer_and_value(rs, &my_int);
    buffer = read_stream_get_buffer_and_pointer(rs, &my_pointer_to_int);

These improve readability and type safety via assertions.
---
 contrib/pg_prewarm/pg_prewarm.c          |  4 +-
 contrib/pg_visibility/pg_visibility.c    |  6 +--
 src/backend/access/heap/heapam.c         |  2 +-
 src/backend/access/heap/heapam_handler.c |  2 +-
 src/backend/access/heap/vacuumlazy.c     |  6 +--
 src/backend/storage/aio/read_stream.c    | 12 +++++
 src/backend/storage/buffer/bufmgr.c      |  4 +-
 src/include/storage/read_stream.h        | 64 ++++++++++++++++++++++++
 8 files changed, 87 insertions(+), 13 deletions(-)

diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index a2f0ac4af0c..f6ae266d7b0 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -208,11 +208,11 @@ pg_prewarm(PG_FUNCTION_ARGS)
 			Buffer		buf;
 
 			CHECK_FOR_INTERRUPTS();
-			buf = read_stream_next_buffer(stream, NULL);
+			buf = read_stream_get_buffer(stream);
 			ReleaseBuffer(buf);
 			++blocks_done;
 		}
-		Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
+		Assert(read_stream_get_buffer(stream) == InvalidBuffer);
 		read_stream_end(stream);
 	}
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 7f268a18a74..e7187a46c9d 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -556,7 +556,7 @@ collect_visibility_data(Oid relid, bool include_pd)
 			Buffer		buffer;
 			Page		page;
 
-			buffer = read_stream_next_buffer(stream, NULL);
+			buffer = read_stream_get_buffer(stream);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
 
 			page = BufferGetPage(buffer);
@@ -569,7 +569,7 @@ collect_visibility_data(Oid relid, bool include_pd)
 
 	if (include_pd)
 	{
-		Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
+		Assert(read_stream_get_buffer(stream) == InvalidBuffer);
 		read_stream_end(stream);
 	}
 
@@ -752,7 +752,7 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
 										0);
 
 	/* Loop over every block in the relation. */
-	while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
+	while ((buffer = read_stream_get_buffer(stream)) != InvalidBuffer)
 	{
 		bool		check_frozen = all_frozen;
 		bool		check_visible = all_visible;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index fa7935a0ed3..86f280069e0 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -609,7 +609,7 @@ heap_fetch_next_buffer(HeapScanDesc scan, ScanDirection dir)
 
 	scan->rs_dir = dir;
 
-	scan->rs_cbuf = read_stream_next_buffer(scan->rs_read_stream, NULL);
+	scan->rs_cbuf = read_stream_get_buffer(scan->rs_read_stream);
 	if (BufferIsValid(scan->rs_cbuf))
 		scan->rs_cblock = BufferGetBlockNumber(scan->rs_cbuf);
 }
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index e78682c3cef..7487896b06c 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -1010,7 +1010,7 @@ heapam_scan_analyze_next_block(TableScanDesc scan, ReadStream *stream)
 	 * re-acquire sharelock for each tuple, but since we aren't doing much
 	 * work per tuple, the extra lock traffic is probably better avoided.
 	 */
-	hscan->rs_cbuf = read_stream_next_buffer(stream, NULL);
+	hscan->rs_cbuf = read_stream_get_buffer(stream);
 	if (!BufferIsValid(hscan->rs_cbuf))
 		return false;
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1af18a78a2b..ac7a4d8c21d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1230,7 +1230,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		Page		page;
 		uint8		blk_info = 0;
 		bool		has_lpdead_items;
-		void	   *per_buffer_data = NULL;
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
@@ -1287,13 +1286,12 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
-		buf = read_stream_next_buffer(stream, &per_buffer_data);
+		buf = read_stream_get_buffer_and_value(stream, &blk_info);
 
 		/* The relation is exhausted. */
 		if (!BufferIsValid(buf))
 			break;
 
-		blk_info = *((uint8 *) per_buffer_data);
 		CheckBufferIsPinnedOnce(buf);
 		page = BufferGetPage(buf);
 		blkno = BufferGetBlockNumber(buf);
@@ -2740,7 +2738,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point(false);
 
-		buf = read_stream_next_buffer(stream, (void **) &iter_result);
+		buf = read_stream_get_buffer_and_pointer(stream, &iter_result);
 
 		/* The relation is exhausted */
 		if (!BufferIsValid(buf))
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4b..0f1332c46f6 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -615,6 +615,9 @@ read_stream_begin_smgr_relation(int flags,
  * valid until the next call to read_stream_next_buffer().  When the stream
  * runs out of data, InvalidBuffer is returned.  The caller may decide to end
  * the stream early at any time by calling read_stream_end().
+ *
+ * See read_stream.h for read_stream_get_buffer() and variants that provide
+ * some degree of type safety for the per_buffer_data argument.
  */
 Buffer
 read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
@@ -840,6 +843,15 @@ read_stream_next_block(ReadStream *stream, BufferAccessStrategy *strategy)
 	return read_stream_get_block(stream, NULL);
 }
 
+/*
+ * Return the configured per-buffer data size, for use in assertions.
+ */
+size_t
+read_stream_per_buffer_data_size(ReadStream *stream)
+{
+	return stream->per_buffer_data_size;
+}
+
 /*
  * Reset a read stream by releasing any queued up buffers, allowing the stream
  * to be used again for different blocks.  This can be used to clear an
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7915ed624c1..f4cedd15109 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4690,7 +4690,7 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
 		CHECK_FOR_INTERRUPTS();
 
 		/* Read block from source relation. */
-		srcBuf = read_stream_next_buffer(src_stream, NULL);
+		srcBuf = read_stream_get_buffer(src_stream);
 		LockBuffer(srcBuf, BUFFER_LOCK_SHARE);
 		srcPage = BufferGetPage(srcBuf);
 
@@ -4715,7 +4715,7 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
 		UnlockReleaseBuffer(dstBuf);
 		UnlockReleaseBuffer(srcBuf);
 	}
-	Assert(read_stream_next_buffer(src_stream, NULL) == InvalidBuffer);
+	Assert(read_stream_get_buffer(src_stream) == InvalidBuffer);
 	read_stream_end(src_stream);
 
 	FreeAccessStrategy(bstrategy_src);
diff --git a/src/include/storage/read_stream.h b/src/include/storage/read_stream.h
index c11d8ce3300..68c9340b0e3 100644
--- a/src/include/storage/read_stream.h
+++ b/src/include/storage/read_stream.h
@@ -70,6 +70,7 @@ extern ReadStream *read_stream_begin_relation(int flags,
 extern Buffer read_stream_next_buffer(ReadStream *stream, void **per_buffer_data);
 extern BlockNumber read_stream_next_block(ReadStream *stream,
 										  BufferAccessStrategy *strategy);
+extern size_t read_stream_per_buffer_data_size(ReadStream *stream);
 extern ReadStream *read_stream_begin_smgr_relation(int flags,
 												   BufferAccessStrategy strategy,
 												   SMgrRelation smgr,
@@ -81,4 +82,67 @@ extern ReadStream *read_stream_begin_smgr_relation(int flags,
 extern void read_stream_reset(ReadStream *stream);
 extern void read_stream_end(ReadStream *stream);
 
+/*
+ * Get the next buffer from a stream that is not using per-buffer data.
+ */
+static inline Buffer
+read_stream_get_buffer(ReadStream *stream)
+{
+	Assert(read_stream_per_buffer_data_size(stream) == 0);
+	return read_stream_next_buffer(stream, NULL);
+}
+
+/*
+ * Inlinable helper for read_stream_get_buffer_and_value() macro.
+ */
+static inline Buffer
+read_stream_get_buffer_and_value_with_size(ReadStream *stream,
+										   void *output_data,
+										   size_t output_data_size)
+{
+	Buffer		buffer;
+	void	   *per_buffer_data;
+
+	Assert(read_stream_per_buffer_data_size(stream) == output_data_size);
+	buffer = read_stream_next_buffer(stream, &per_buffer_data);
+	if (buffer != InvalidBuffer)
+		memcpy(output_data, per_buffer_data, output_data_size);
+
+	return buffer;
+}
+
+/*
+ * Get the next buffer and a copy of the associated per-buffer data.
+ * InvalidBuffer means end-of-stream, and in that case the per-buffer data is
+ * undefined.  Example of use:
+ *
+ * int my_int;
+ *
+ * buf = read_stream_get_buffer_and_value(stream, &my_int);
+ */
+#define read_stream_get_buffer_and_value(stream, vp) \
+	read_stream_get_buffer_and_value_with_size((stream), (vp), sizeof(*(vp)))
+
+/*
+ * Get the next buffer and a pointer to the associated per-buffer data.  This
+ * avoids casts, while still checking that we have the expected level of
+ * indirection.  InvalidBuffer means end-of-stream, and in that case the output
+ * pointer is undefined.  Otherwise the output pointer should only be
+ * dereferenced up until the next call.  For example:
+ *
+ * int *my_int_p;
+ *
+ * buf = read_stream_get_buffer_and_pointer(stream, &my_int_p);
+ */
+#if HAVE__BUILTIN_TYPES_COMPATIBLE_P
+#define read_stream_get_buffer_and_pointer(stream, pointer) \
+	(StaticAssertExpr(!__builtin_types_compatible_p(__typeof__(**(pointer)), \
+													void), \
+					  "expected pointer to pointer to non-void"), \
+	 read_stream_next_buffer((stream), ((void **) (pointer))))
+#else
+#define read_stream_get_buffer_and_pointer(stream, pointer) \
+	read_stream_next_buffer((stream), ((void **) (pointer)))
+#endif
+
 #endif							/* READ_STREAM_H */
-- 
2.48.1

0002-Improve-API-for-storing-data-in-read-streams.patchtext/x-patch; charset=US-ASCII; name=0002-Improve-API-for-storing-data-in-read-streams.patchDownload
From 02bb13b80abc20cd8221287b4c908f9c7c0dde23 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 28 Feb 2025 12:53:23 +1300
Subject: [PATCH 2/2] Improve API for storing data in read streams.

Read stream callbacks receive a void pointer into the per-buffer data
queue so that they can store data there for later retrieval by the buffer
consumer.  We can improve readability and safety a bit by changing
cast-and-assign or raw memcpy() to:

	read_stream_put_value(stream, per_buffer_data, my_int);

This form infers the size and asserts that the storage space matches,
generally mirroring the read_stream_get_buffer_and_value() call used for
retrieving the streamed data later.
---
 src/backend/access/heap/vacuumlazy.c | 6 +++---
 src/include/storage/read_stream.h    | 9 +++++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac7a4d8c21d..9563906fb27 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1612,7 +1612,7 @@ heap_vac_scan_next_block(ReadStream *stream,
 		 */
 		vacrel->current_block = next_block;
 		blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		*((uint8 *) per_buffer_data) = blk_info;
+		read_stream_put_value(stream, per_buffer_data, blk_info);
 		return vacrel->current_block;
 	}
 	else
@@ -1628,7 +1628,7 @@ heap_vac_scan_next_block(ReadStream *stream,
 			blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
 			blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
-		*((uint8 *) per_buffer_data) = blk_info;
+		read_stream_put_value(stream, per_buffer_data, blk_info);
 		return vacrel->current_block;
 	}
 }
@@ -2671,7 +2671,7 @@ vacuum_reap_lp_read_stream_next(ReadStream *stream,
 	 * Save the TidStoreIterResult for later, so we can extract the offsets.
 	 * It is safe to copy the result, according to TidStoreIterateNext().
 	 */
-	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+	read_stream_put_value(stream, per_buffer_data, *iter_result);
 
 	return iter_result->blkno;
 }
diff --git a/src/include/storage/read_stream.h b/src/include/storage/read_stream.h
index 68c9340b0e3..673dd75ccaf 100644
--- a/src/include/storage/read_stream.h
+++ b/src/include/storage/read_stream.h
@@ -145,4 +145,13 @@ read_stream_get_buffer_and_value_with_size(ReadStream *stream,
 	read_stream_next_buffer((stream), ((void **) (pointer)))
 #endif
 
+/*
+ * Set the per-buffer data by value.  This can be called from inside a
+ * callback that is returning block numbers.  It asserts that the value's size
+ * matches the available space.
+ */
+#define read_stream_put_value(stream, per_buffer_data, value) \
+	(AssertMacro(sizeof(value) == read_stream_per_buffer_data_size(stream)), \
+	 memcpy((per_buffer_data), &(value), sizeof(value)))
+
 #endif							/* READ_STREAM_H */
-- 
2.48.1

#76Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#75)
2 attachment(s)
Re: Confine vacuum skip logic to lazy_scan_skip

On Fri, Feb 28, 2025 at 2:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Fri, Feb 28, 2025 at 11:58 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Thu, Feb 27, 2025 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder if it'd be a good idea to add something like

Assert(stream->distance == 1);
Assert(stream->pending_read_nblocks == 0);
Assert(stream->per_buffer_data_size == 0);
+ Assert(per_buffer_data == NULL);

in read_stream_next_buffer. I doubt that this will shut Coverity
up, but it would help to catch caller coding errors, i.e. passing
a per_buffer_data pointer when there's no per-buffer data.

I think this is a good stopgap. I was discussing adding this assert
off-list with Thomas and he wanted to detail his more ambitious plans
for type safety improvements in the read stream API. Less on the order
of a redesign and more like separate read_stream_next_buffer() variants
for when there is per-buffer data and when there isn't, plus by-value
and by-reference versions for the case where there is data.

Here's what I had in mind. Is it better?

Here's a slightly better one. I think when you use
read_stream_get_buffer_and_value(stream, &value), or
read_stream_put_value(stream, space, value), then we should assert
that sizeof(value) strictly matches the available space, as shown. But,
new in v2, if you use read_stream_get_buffer_and_pointer(stream,
&pointer), then sizeof(*pointer) should only have to be <= the
storage space, not ==, because someone might plausibly want to make
per_buffer_data_size variable at runtime (ie decide when they
construct the stream), and then be able to retrieve a pointer to the
start of a struct with a flexible array or something like that. In v1
I was just trying to assert that it was a
pointer-to-a-pointer-to-something and no more (in a confusing
compile-time assertion), but v2 is simpler, and is happy with a
pointer to a pointer to something that doesn't exceed the space
(run-time assertion).
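
The == vs <= distinction can be illustrated with a self-contained sketch (the mock_* names and struct payload are invented for the example; the real patch uses read_stream_per_buffer_data_size()): by-value storage must exactly fill the configured space, while by-pointer access only requires that the pointed-to type fits, which is what permits a runtime-sized payload fronted by a struct with a flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configured size; in the real patch this comes from
 * read_stream_per_buffer_data_size(stream). */
static size_t mock_data_size;

/* By-value: the stored value must exactly fill the configured space. */
#define mock_put_value(space, value) \
	(assert(sizeof(value) == mock_data_size), \
	 memcpy((space), &(value), sizeof(value)))

/* By-pointer: the pointed-to type only has to fit, so a header with a
 * flexible array member can front a larger, runtime-sized payload. */
#define mock_get_pointer(space, pp) \
	(assert(sizeof(**(pp)) <= mock_data_size), \
	 (void) (*(pp) = (void *) (space)))

struct payload
{
	int			nitems;
	unsigned char items[];	/* flexible array member */
};

int
demo(void)
{
	_Alignas(max_align_t) unsigned char space[64];
	unsigned char small;
	unsigned char byte = 7;
	struct payload *pp;

	/* Exact-size by-value round trip. */
	mock_data_size = sizeof(byte);
	mock_put_value(space, byte);
	small = space[0];

	/* Variable-size by-pointer access: sizeof(struct payload) <= 64. */
	mock_data_size = sizeof(space);
	((struct payload *) space)->nitems = 3;
	mock_get_pointer(space, &pp);
	return small == 7 && pp->nitems == 3;
}
```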

Attachments:

v2-0001-Improve-API-for-retrieving-data-from-read-streams.patchapplication/x-patch; name=v2-0001-Improve-API-for-retrieving-data-from-read-streams.patchDownload
From b2dd9c90f970a889deea2c2e9e16097e4e06ece8 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 28 Feb 2025 10:48:29 +1300
Subject: [PATCH v2 1/2] Improve API for retrieving data from read streams.

Dealing with the per_buffer_data argument to read_stream_next_buffer()
has proven a bit clunky.  Provide some new wrapper functions/macros:

    buffer = read_stream_get_buffer(rs);
    buffer = read_stream_get_buffer_and_value(rs, &my_int);
    buffer = read_stream_get_buffer_and_pointer(rs, &my_pointer_to_int);

These improve readability and type safety via assertions.
---
 contrib/pg_prewarm/pg_prewarm.c          |  4 +-
 contrib/pg_visibility/pg_visibility.c    |  6 +--
 src/backend/access/heap/heapam.c         |  2 +-
 src/backend/access/heap/heapam_handler.c |  2 +-
 src/backend/access/heap/vacuumlazy.c     |  6 +--
 src/backend/storage/aio/read_stream.c    | 12 ++++++
 src/backend/storage/buffer/bufmgr.c      |  4 +-
 src/include/storage/read_stream.h        | 55 ++++++++++++++++++++++++
 8 files changed, 78 insertions(+), 13 deletions(-)

diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
index a2f0ac4af0c..f6ae266d7b0 100644
--- a/contrib/pg_prewarm/pg_prewarm.c
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -208,11 +208,11 @@ pg_prewarm(PG_FUNCTION_ARGS)
 			Buffer		buf;
 
 			CHECK_FOR_INTERRUPTS();
-			buf = read_stream_next_buffer(stream, NULL);
+			buf = read_stream_get_buffer(stream);
 			ReleaseBuffer(buf);
 			++blocks_done;
 		}
-		Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
+		Assert(read_stream_get_buffer(stream) == InvalidBuffer);
 		read_stream_end(stream);
 	}
 
diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 7f268a18a74..e7187a46c9d 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -556,7 +556,7 @@ collect_visibility_data(Oid relid, bool include_pd)
 			Buffer		buffer;
 			Page		page;
 
-			buffer = read_stream_next_buffer(stream, NULL);
+			buffer = read_stream_get_buffer(stream);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
 
 			page = BufferGetPage(buffer);
@@ -569,7 +569,7 @@ collect_visibility_data(Oid relid, bool include_pd)
 
 	if (include_pd)
 	{
-		Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
+		Assert(read_stream_get_buffer(stream) == InvalidBuffer);
 		read_stream_end(stream);
 	}
 
@@ -752,7 +752,7 @@ collect_corrupt_items(Oid relid, bool all_visible, bool all_frozen)
 										0);
 
 	/* Loop over every block in the relation. */
-	while ((buffer = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
+	while ((buffer = read_stream_get_buffer(stream)) != InvalidBuffer)
 	{
 		bool		check_frozen = all_frozen;
 		bool		check_visible = all_visible;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index fa7935a0ed3..86f280069e0 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -609,7 +609,7 @@ heap_fetch_next_buffer(HeapScanDesc scan, ScanDirection dir)
 
 	scan->rs_dir = dir;
 
-	scan->rs_cbuf = read_stream_next_buffer(scan->rs_read_stream, NULL);
+	scan->rs_cbuf = read_stream_get_buffer(scan->rs_read_stream);
 	if (BufferIsValid(scan->rs_cbuf))
 		scan->rs_cblock = BufferGetBlockNumber(scan->rs_cbuf);
 }
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index e78682c3cef..7487896b06c 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -1010,7 +1010,7 @@ heapam_scan_analyze_next_block(TableScanDesc scan, ReadStream *stream)
 	 * re-acquire sharelock for each tuple, but since we aren't doing much
 	 * work per tuple, the extra lock traffic is probably better avoided.
 	 */
-	hscan->rs_cbuf = read_stream_next_buffer(stream, NULL);
+	hscan->rs_cbuf = read_stream_get_buffer(stream);
 	if (!BufferIsValid(hscan->rs_cbuf))
 		return false;
 
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 1af18a78a2b..ac7a4d8c21d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1230,7 +1230,6 @@ lazy_scan_heap(LVRelState *vacrel)
 		Page		page;
 		uint8		blk_info = 0;
 		bool		has_lpdead_items;
-		void	   *per_buffer_data = NULL;
 		bool		vm_page_frozen = false;
 		bool		got_cleanup_lock = false;
 
@@ -1287,13 +1286,12 @@ lazy_scan_heap(LVRelState *vacrel)
 										 PROGRESS_VACUUM_PHASE_SCAN_HEAP);
 		}
 
-		buf = read_stream_next_buffer(stream, &per_buffer_data);
+		buf = read_stream_get_buffer_and_value(stream, &blk_info);
 
 		/* The relation is exhausted. */
 		if (!BufferIsValid(buf))
 			break;
 
-		blk_info = *((uint8 *) per_buffer_data);
 		CheckBufferIsPinnedOnce(buf);
 		page = BufferGetPage(buf);
 		blkno = BufferGetBlockNumber(buf);
@@ -2740,7 +2738,7 @@ lazy_vacuum_heap_rel(LVRelState *vacrel)
 
 		vacuum_delay_point(false);
 
-		buf = read_stream_next_buffer(stream, (void **) &iter_result);
+		buf = read_stream_get_buffer_and_pointer(stream, &iter_result);
 
 		/* The relation is exhausted */
 		if (!BufferIsValid(buf))
diff --git a/src/backend/storage/aio/read_stream.c b/src/backend/storage/aio/read_stream.c
index 04bdb5e6d4b..0f1332c46f6 100644
--- a/src/backend/storage/aio/read_stream.c
+++ b/src/backend/storage/aio/read_stream.c
@@ -615,6 +615,9 @@ read_stream_begin_smgr_relation(int flags,
  * valid until the next call to read_stream_next_buffer().  When the stream
  * runs out of data, InvalidBuffer is returned.  The caller may decide to end
  * the stream early at any time by calling read_stream_end().
+ *
+ * See read_stream.h for read_stream_get_buffer() and variants that provide
+ * some degree of type safety for the per_buffer_data argument.
  */
 Buffer
 read_stream_next_buffer(ReadStream *stream, void **per_buffer_data)
@@ -840,6 +843,15 @@ read_stream_next_block(ReadStream *stream, BufferAccessStrategy *strategy)
 	return read_stream_get_block(stream, NULL);
 }
 
+/*
+ * Return the configured per-buffer data size, for use in assertions.
+ */
+size_t
+read_stream_per_buffer_data_size(ReadStream *stream)
+{
+	return stream->per_buffer_data_size;
+}
+
 /*
  * Reset a read stream by releasing any queued up buffers, allowing the stream
  * to be used again for different blocks.  This can be used to clear an
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 7915ed624c1..f4cedd15109 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4690,7 +4690,7 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
 		CHECK_FOR_INTERRUPTS();
 
 		/* Read block from source relation. */
-		srcBuf = read_stream_next_buffer(src_stream, NULL);
+		srcBuf = read_stream_get_buffer(src_stream);
 		LockBuffer(srcBuf, BUFFER_LOCK_SHARE);
 		srcPage = BufferGetPage(srcBuf);
 
@@ -4715,7 +4715,7 @@ RelationCopyStorageUsingBuffer(RelFileLocator srclocator,
 		UnlockReleaseBuffer(dstBuf);
 		UnlockReleaseBuffer(srcBuf);
 	}
-	Assert(read_stream_next_buffer(src_stream, NULL) == InvalidBuffer);
+	Assert(read_stream_get_buffer(src_stream) == InvalidBuffer);
 	read_stream_end(src_stream);
 
 	FreeAccessStrategy(bstrategy_src);
diff --git a/src/include/storage/read_stream.h b/src/include/storage/read_stream.h
index c11d8ce3300..c6066c0f296 100644
--- a/src/include/storage/read_stream.h
+++ b/src/include/storage/read_stream.h
@@ -70,6 +70,7 @@ extern ReadStream *read_stream_begin_relation(int flags,
 extern Buffer read_stream_next_buffer(ReadStream *stream, void **per_buffer_data);
 extern BlockNumber read_stream_next_block(ReadStream *stream,
 										  BufferAccessStrategy *strategy);
+extern size_t read_stream_per_buffer_data_size(ReadStream *stream);
 extern ReadStream *read_stream_begin_smgr_relation(int flags,
 												   BufferAccessStrategy strategy,
 												   SMgrRelation smgr,
@@ -81,4 +82,58 @@ extern ReadStream *read_stream_begin_smgr_relation(int flags,
 extern void read_stream_reset(ReadStream *stream);
 extern void read_stream_end(ReadStream *stream);
 
+/*
+ * Get the next buffer from a stream that is not using per-buffer data.
+ */
+static inline Buffer
+read_stream_get_buffer(ReadStream *stream)
+{
+	Assert(read_stream_per_buffer_data_size(stream) == 0);
+	return read_stream_next_buffer(stream, NULL);
+}
+
+/*
+ * Helper for read_stream_get_buffer_and_value().
+ */
+static inline Buffer
+read_stream_get_buffer_and_value_with_size(ReadStream *stream,
+										   void *output_data,
+										   size_t output_data_size)
+{
+	Buffer		buffer;
+	void	   *per_buffer_data;
+
+	Assert(read_stream_per_buffer_data_size(stream) == output_data_size);
+	buffer = read_stream_next_buffer(stream, &per_buffer_data);
+	if (buffer != InvalidBuffer)
+		memcpy(output_data, per_buffer_data, output_data_size);
+
+	return buffer;
+}
+
+/*
+ * Get the next buffer and a copy of the associated per-buffer data.
+ * InvalidBuffer means end-of-stream, and in that case the per-buffer data is
+ * undefined.  Example of use:
+ *
+ * int my_int;
+ *
+ * buf = read_stream_get_buffer_and_value(stream, &my_int);
+ */
+#define read_stream_get_buffer_and_value(stream, vp) \
+	read_stream_get_buffer_and_value_with_size((stream), (vp), sizeof(*(vp)))
+
+/*
+ * Get the next buffer and a pointer to the associated per-buffer data.  This
+ * avoids casts in the calling code, and asserts that we received a pointer to
+ * a pointer to a type that doesn't exceed the storage size.  For example:
+ *
+ * int *my_int_p;
+ *
+ * buf = read_stream_get_buffer_and_pointer(stream, &my_int_p);
+ */
+#define read_stream_get_buffer_and_pointer(stream, pointer) \
+	(AssertMacro(sizeof(**(pointer)) <= read_stream_per_buffer_data_size(stream)), \
+	 read_stream_next_buffer((stream), ((void **) (pointer))))
+
 #endif							/* READ_STREAM_H */
-- 
2.48.1

v2-0002-Improve-API-for-storing-data-in-read-streams.patchapplication/x-patch; name=v2-0002-Improve-API-for-storing-data-in-read-streams.patchDownload
From 2586d9c7321391168a40cb0f14e5a80182792b64 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Fri, 28 Feb 2025 12:53:23 +1300
Subject: [PATCH v2 2/2] Improve API for storing data in read streams.

Read stream callbacks receive a void pointer into the per-buffer data
queue so that they can store data there for later retrieval by the buffer
consumer.  We can improve readability and safety a bit by changing
cast-and-assign or raw memcpy() to:

	read_stream_put_value(stream, per_buffer_data, my_int);

This form infers the size and asserts that the storage space matches,
generally mirroring the read_stream_get_buffer_and_value() call used for
retrieving the streamed data later.
---
 src/backend/access/heap/vacuumlazy.c | 6 +++---
 src/include/storage/read_stream.h    | 9 +++++++++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index ac7a4d8c21d..9563906fb27 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1612,7 +1612,7 @@ heap_vac_scan_next_block(ReadStream *stream,
 		 */
 		vacrel->current_block = next_block;
 		blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
-		*((uint8 *) per_buffer_data) = blk_info;
+		read_stream_put_value(stream, per_buffer_data, blk_info);
 		return vacrel->current_block;
 	}
 	else
@@ -1628,7 +1628,7 @@ heap_vac_scan_next_block(ReadStream *stream,
 			blk_info |= VAC_BLK_ALL_VISIBLE_ACCORDING_TO_VM;
 		if (vacrel->next_unskippable_eager_scanned)
 			blk_info |= VAC_BLK_WAS_EAGER_SCANNED;
-		*((uint8 *) per_buffer_data) = blk_info;
+		read_stream_put_value(stream, per_buffer_data, blk_info);
 		return vacrel->current_block;
 	}
 }
@@ -2671,7 +2671,7 @@ vacuum_reap_lp_read_stream_next(ReadStream *stream,
 	 * Save the TidStoreIterResult for later, so we can extract the offsets.
 	 * It is safe to copy the result, according to TidStoreIterateNext().
 	 */
-	memcpy(per_buffer_data, iter_result, sizeof(*iter_result));
+	read_stream_put_value(stream, per_buffer_data, *iter_result);
 
 	return iter_result->blkno;
 }
diff --git a/src/include/storage/read_stream.h b/src/include/storage/read_stream.h
index c6066c0f296..5af801969b4 100644
--- a/src/include/storage/read_stream.h
+++ b/src/include/storage/read_stream.h
@@ -136,4 +136,13 @@ read_stream_get_buffer_and_value_with_size(ReadStream *stream,
 	(AssertMacro(sizeof(**(pointer)) <= read_stream_per_buffer_data_size(stream)), \
 	 read_stream_next_buffer((stream), ((void **) (pointer))))
 
+/*
+ * Set the per-buffer data by value.  This can be called from inside a
+ * callback that is returning block numbers.  It asserts that the value's size
+ * matches the available space.
+ */
+#define read_stream_put_value(stream, per_buffer_data, value) \
+	(AssertMacro(sizeof(value) == read_stream_per_buffer_data_size(stream)), \
+	 memcpy((per_buffer_data), &(value), sizeof(value)))
+
 #endif							/* READ_STREAM_H */
-- 
2.48.1

#77Álvaro Herrera
alvherre@kurilemu.de
In reply to: Thomas Munro (#76)
Re: Confine vacuum skip logic to lazy_scan_skip

Moved this [1] to PG19-Drafts. Feel free to put it back in a regular
commitfest when you want to return to it.

[1]: https://commitfest.postgresql.org/patch/5617/

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"La virtud es el justo medio entre dos defectos" (Aristóteles)

#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#41)
Re: Confine vacuum skip logic to lazy_scan_skip

[ seizing on this old commit as being most closely related to the issue ]

Thomas Munro <thomas.munro@gmail.com> writes:

On Tue, Jul 16, 2024 at 1:52 PM Noah Misch <noah@leadboat.com> wrote:

On Mon, Jul 15, 2024 at 03:26:32PM +1200, Thomas Munro wrote:
That's reasonable. radixtree already forbids mutations concurrent with
iteration, so there's no new concurrency hazard. One alternative is
per_buffer_data big enough for MaxOffsetNumber, but that might thrash caches
measurably. That patch is good to go apart from these trivialities:

Thanks! I have pushed that patch, without those changes you didn't like.

The security team recently updated our Coverity instance to the latest
version, and it's started complaining as follows:

*** CID 1667418: Memory - corruptions (OVERRUN)
/srv/coverity/git/pgsql-git/postgresql/src/backend/access/heap/vacuumlazy.c: 2812 in lazy_vacuum_heap_rel()
2806 * already have the correct page pinned anyway.
2807 */
2808 visibilitymap_pin(vacrel->rel, blkno, &vmbuffer);
2809
2810 /* We need a non-cleanup exclusive lock to mark dead_items unused */
2811 LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

CID 1667418: Memory - corruptions (OVERRUN)
Overrunning callee's array of size 291 by passing argument "num_offsets" (which evaluates to 2048) in call to "lazy_vacuum_heap_page".

2812 lazy_vacuum_heap_page(vacrel, blkno, buf, offsets,
2813 num_offsets, vmbuffer);
2814
2815 /* Now that we've vacuumed the page, record its available space */
2816 page = BufferGetPage(buf);
2817 freespace = PageGetHeapFreeSpace(page);

The reason it thinks that num_offsets could be as much as 2048 is
presumably the code a little bit above this:

OffsetNumber offsets[MaxOffsetNumber];
...
num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
Assert(num_offsets <= lengthof(offsets));

However, lazy_vacuum_heap_page blindly assumes that the passed value
will be no more than MaxHeapTuplesPerPage. It seems like we ought
to get these two functions in sync, either both using MaxOffsetNumber
or both using MaxHeapTuplesPerPage for their array sizes.

It looks to me like MaxHeapTuplesPerPage should be sufficient.
Also, after reading TidStoreGetBlockOffsets I wonder if we
should replace that Assert with

num_offsets = Min(num_offsets, lengthof(offsets));

Thoughts?

regards, tom lane

#79John Naylor
johncnaylorls@gmail.com
In reply to: Tom Lane (#78)
Re: Confine vacuum skip logic to lazy_scan_skip

On Wed, Oct 22, 2025 at 11:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason it thinks that num_offsets could be as much as 2048 is
presumably the code a little bit above this:

OffsetNumber offsets[MaxOffsetNumber];
...
num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
Assert(num_offsets <= lengthof(offsets));

However, lazy_vacuum_heap_page blindly assumes that the passed value
will be no more than MaxHeapTuplesPerPage. It seems like we ought
to get these two functions in sync, either both using MaxOffsetNumber
or both using MaxHeapTuplesPerPage for their array sizes.

It looks to me like MaxHeapTuplesPerPage should be sufficient.

Seems right.

Also, after reading TidStoreGetBlockOffsets I wonder if we
should replace that Assert with

num_offsets = Min(num_offsets, lengthof(offsets));

Thoughts?

Not sure. That changes the posture from "can't happen" to "shouldn't
happen, but if it does, don't cause a disaster". Even with the latter,
the assert still seems appropriate for catching developer mistakes.

--
John Naylor
Amazon Web Services

#80Melanie Plageman
melanieplageman@gmail.com
In reply to: John Naylor (#79)
Re: Confine vacuum skip logic to lazy_scan_skip

On Thu, Oct 30, 2025 at 7:14 AM John Naylor <johncnaylorls@gmail.com> wrote:

On Wed, Oct 22, 2025 at 11:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason it thinks that num_offsets could be as much as 2048 is
presumably the code a little bit above this:

OffsetNumber offsets[MaxOffsetNumber];
...
num_offsets = TidStoreGetBlockOffsets(iter_result, offsets, lengthof(offsets));
Assert(num_offsets <= lengthof(offsets));

However, lazy_vacuum_heap_page blindly assumes that the passed value
will be no more than MaxHeapTuplesPerPage. It seems like we ought
to get these two functions in sync, either both using MaxOffsetNumber
or both using MaxHeapTuplesPerPage for their array sizes.

It looks to me like MaxHeapTuplesPerPage should be sufficient.

Seems right.

Yes, it makes sense to me to make offsets size MaxHeapTuplesPerPage,
if that is what is being suggested. Doesn't hurt to take up a bit less
stack space too.

Also, after reading TidStoreGetBlockOffsets I wonder if we
should replace that Assert with

num_offsets = Min(num_offsets, lengthof(offsets));

Thoughts?

Not sure. That changes the posture from "can't happen" to "shouldn't
happen, but if it does, don't cause a disaster". Even with the latter,
the assert still seems appropriate for catching developer mistakes.

You are suggesting keeping the assert and this line after it?

num_offsets = Min(num_offsets, lengthof(offsets));

The current contract of TidStoreGetBlockOffsets() is that it won't
return a value larger than max_offsets passed in, so it is a good idea
to have an assert in case it changes. But, if we take the minimum,
then is the assert there to keep developers from changing
TidStoreGetBlockOffsets()'s behavior? I don't know if I
like that, but I don't feel strongly enough to object. Anyway, I think
we should add the line Tom suggested.

- Melanie

#81John Naylor
johncnaylorls@gmail.com
In reply to: Melanie Plageman (#80)
Re: Confine vacuum skip logic to lazy_scan_skip

On Mon, Nov 3, 2025 at 10:59 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Not sure. That changes the posture from "can't happen" to "shouldn't
happen, but if it does, don't cause a disaster". Even with the latter,
the assert still seems appropriate for catching developer mistakes.

You are suggesting keeping the assert and this line after it?

num_offsets = Min(num_offsets, lengthof(offsets));

My "not sure" was referring to this line.

The current contract of TidStoreGetBlockOffsets() is that it won't
return a value larger than max_offsets passed in, so it is a good idea
to have an assert in case it changes.

I suspect the contract is the way it is in order to enable the assert to work.

But, if we take the minimum,
then is the assert there to keep developers from changing
TidStoreGetBlockOffsets()'s behavior? I don't know if I
like that, but I don't feel strongly enough to object. Anyway, I think
we should add the line Tom suggested.

This line seems strange to me (and maybe even stranger to have both
the min and the assert), but maybe I don't understand Tom's rationale
well enough. Do we need it to silence Coverity?

--
John Naylor
Amazon Web Services