[PATCHES] Post-special page storage TDE support
Hi -hackers,
An additional piece that I am working on for improving infra for TDE
features is allowing the storage of additional per-page data. Rather
than hard-code the idea of a specific struct, this is utilizing a new,
more dynamic structure to associate page offsets with a particular
feature that may or may not be present in a given cluster. I am
calling this generic structure a PageFeature/PageFeatureSet (better
names welcome), which is defined for a cluster at initdb/bootstrap
time, and reserves a given amount of trailing space on the Page which
is then parceled out to the consumers of said space.
While the immediate need that this feature fills is storage of
encryption tags for XTS-based encryption on the pages themselves, this
can also be used for any optional features; as an example I have
implemented expanded checksum support (both 32- and 64-bit), as well
as a self-descriptive "wasted space" feature, which just allocates
trailing space from the page (obviously intended as illustration
only).
There are 6 commits in this series:
0001 - adds `reserved_page_space` global, making various size
calculations and limits dynamic, adjusting access methods to offset
special space, and ensuring that we can safely reserve allocated space
from the end of pages.
0002 - test suite stability fixes; the change in the number of tuples
per page breaks some ordering assumptions in existing tests
0003 - the "PageFeatures" commit, the meat of this feature (see
following description)
0004 - page_checksum32 feature - store the full 32-bit checksum across
the existing pd_checksum field as well as 2 bytes from
reserved_page_space. This is more of a demo of what could be done
here than a practical feature.
0005 - wasted space PageFeature - just use up space. An additional
feature we can turn on/off to see how multiple features interact.
Only for illustration.
0006 - 64-bit checksums - fully allocated from reserved_page_space.
Using an MIT-licensed 64-bit checksum, but if we determined we'd want
to do this we'd probably roll our own.
From the commit message for PageFeatures:
Page features are a standardized way of assigning and using dynamic
space usage from the tail end of a disk page. These features are set
at cluster init time (so configured via `initdb` and initialized via
the bootstrap process) and affect all disk pages.

A PageFeatureSet is effectively a bitflag of all configured features,
each of which has a fixed size. If not using any PageFeatures, the
storage overhead of this is 0.
Rather than using a variable-location struct, an implementation of a
PageFeature is responsible for an offset and a length in the page.
The current API returns only a pointer to the page location for the
implementation to manage, and no further checks are done to ensure
that only the expected memory is accessed.

Access to the underlying memory is synonymous with determining whether
a given cluster is using an underlying PageFeature, so code paths can
do something like:
    char *loc;

    if ((loc = ClusterGetPageFeatureOffset(page, PF_MY_FEATURE_ID)))
    {
        // ipso facto this feature is enabled in this cluster *and*
        // we know the memory address
        ...
    }
Since this is direct memory access to the underlying Page, ensure the
buffer is pinned. Explicit locking (assuming you stay in your lane)
should only need to guard against access from other backends of this
type when using shared buffers, so it will be use-case dependent.
This does have a runtime overhead due to moving some offset
calculations from compile time to
runtime. It is thought that the utility of this feature will outweigh
the costs here.
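The runtime cost referred to above is the offset bookkeeping: with a
bitflag of fixed-size features, each feature's chunk is carved off the
page tail in feature order. A minimal standalone sketch of that
arithmetic (the names, feature list, and sizes are all hypothetical
illustrations, not the actual patch API):

```c
/* Illustrative sketch of how per-feature offsets could be parceled out
 * from the page tail; names and sizes are hypothetical, not the patch API. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLCKSZ 8192

typedef uint16_t PageFeatureSet;	/* bitflag of enabled features */

enum { PF_CHECKSUM64 = 0, PF_ENCRYPTION_TAG = 1, PF_MAX };

/* each feature reserves a fixed number of bytes */
static const size_t feature_sizes[PF_MAX] = {8, 16};

/* total trailing space reserved for the enabled feature set */
static size_t
reserved_space(PageFeatureSet fs)
{
	size_t		total = 0;

	for (int f = 0; f < PF_MAX; f++)
		if (fs & (1 << f))
			total += feature_sizes[f];
	return total;
}

/* offset of a feature's chunk from the page start, or -1 if disabled;
 * chunks are carved off the tail in feature-number order */
static ptrdiff_t
feature_offset(PageFeatureSet fs, int feature)
{
	size_t		tail = BLCKSZ;

	if (!(fs & (1 << feature)))
		return -1;
	for (int f = 0; f < PF_MAX; f++)
	{
		if (!(fs & (1 << f)))
			continue;
		tail -= feature_sizes[f];
		if (f == feature)
			return (ptrdiff_t) tail;
	}
	return -1;
}
```

This also shows why the storage overhead is zero when no features are
configured: reserved_space(0) is 0, and no page space is set aside.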
Candidates for page features include 32-bit or 64-bit checksums,
encryption tags, or additional
per-page metadata.
While we are not currently getting rid of the pd_checksum field, this
mechanism could be used to
free up those 16 bits for some other purpose. One such purpose might
be to mirror the cluster-wide
PageFeatureSet, currently also a uint16, which would mean the entirety
of this scheme could be
reflected in a given page, opening up per-relation or even per-page
setting/metadata here. (We'd
presumably need to snag a pd_flags bit to interpret pd_checksum that
way, but it would be an
interesting use.)
Discussion is welcome and encouraged!
Thanks,
David
Attachments:
- 0001-Add-reserved_page_space-to-Page-structure.patch
- 0002-Make-the-output-of-select_views-test-stable.patch
- 0003-Add-cluster-wide-Page-Features.patch
- 0004-Add-page_checksums32-page-feature.patch
- 0005-A-second-page-feature-just-to-allocate-more-space.patch
- 0006-Add-64-bit-checksum-page-feature.patch
Hi,
On Mon, Oct 24, 2022 at 12:55:53PM -0500, David Christensen wrote:
> Explicitly locking (assuming you stay in your lane) should only need
> to guard against access from other backends of this type if using
> shared buffers, so will be use-case dependent.

I'm not sure what you mean here?
> This does have a runtime overhead due to moving some offset
> calculations from compile time to runtime. It is thought that the
> utility of this feature will outweigh the costs here.

Have you done some benchmarking to give an idea of how much overhead
we're talking about?
> Candidates for page features include 32-bit or 64-bit checksums,
> encryption tags, or additional per-page metadata.
>
> While we are not currently getting rid of the pd_checksum field, this
> mechanism could be used to free up that 16 bits for some other
> purpose.
IIUC there's a hard requirement of initdb-time initialization, as there's
otherwise no guarantee that you will find enough free space in each page at
runtime. It seems like a very hard requirement for a full replacement of the
current checksum approach (even if I agree that the current implementation
limitations are far from ideal), especially since there's no technical reason
that would prevent us from dynamically enabling data-checksums without doing
all the work when the cluster is down.
>> Explicitly locking (assuming you stay in your lane) should only need
>> to guard against access from other backends of this type if using
>> shared buffers, so will be use-case dependent.
>
> I'm not sure what you mean here?
I'm mainly pointing out that the specific code that manages this
feature is the only one who has to worry about modifying said page
region.
>> This does have a runtime overhead due to moving some offset
>> calculations from compile time to runtime. It is thought that the
>> utility of this feature will outweigh the costs here.
>
> Have you done some benchmarking to give an idea of how much overhead
> we're talking about?
Not yet, but I am going to work on this. I suspect the current code
could be improved, but will try to get some sort of measurement of the
additional overhead.
>> Candidates for page features include 32-bit or 64-bit checksums,
>> encryption tags, or additional per-page metadata.
>>
>> While we are not currently getting rid of the pd_checksum field,
>> this mechanism could be used to free up that 16 bits for some other
>> purpose.
>
> IIUC there's a hard requirement of initdb-time initialization, as
> there's otherwise no guarantee that you will find enough free space
> in each page at runtime. It seems like a very hard requirement for a
> full replacement of the current checksum approach (even if I agree
> that the current implementation limitations are far from ideal),
> especially since there's no technical reason that would prevent us
> from dynamically enabling data-checksums without doing all the work
> when the cluster is down.
As implemented, that is correct; we are currently assuming this
specific feature mechanism is set at initdb time only. Checksums are
not the primary motivation here, but were something that I could use
for an immediate illustration of the feature. That said, presumably
you could define a way to set the features per-relation (say with a
template field in pg_class) which would propagate to a relation on
rewrite, so there could be ways to handle things incrementally, were
this an overall goal.
Thanks for looking,
David
Hi
On Mon, 24 Oct 2022, 19:56 David Christensen, <
david.christensen@crunchydata.com> wrote:
> Discussion is welcome and encouraged!
Did you read the related thread with related discussion from last
June, "Re: better page-level checksums" [0]? In that I argued that
space at the end of a page is already allocated for the AM, and that
reserving variable space at the end of the page for non-AM usage is
wasting the AM's performance potential.
Apart from that: Is this variable-sized 'metadata' associated with smgr
infrastructure only, or is it also available for AM features? If not; then
this is a strong -1. The amount of tasks smgr needs to do on a page is
generally much less than the amount of tasks an AM needs to do; so in my
view the AM has priority in prime page real estate, not smgr or related
infrastructure.
re: PageFeatures
I'm not sure I understand the goal, nor the reasoning. Shouldn't this be
part of the storage manager (smgr) implementation / can't this be part of
the smgr of the relation?
re: use of pd_checksum
I mentioned this in the above-mentioned thread too, in [1], that we
could use pd_checksum as an extra area marker for this
storage-specific data, which would be located between pd_upper and
pd_special.
Re: patch contents
0001:
> + specialSize = MAXALIGN(specialSize) + reserved_page_size;
This needs to be aligned, so MAXALIGN(specialSize + reserved_page_size), or
an assertion that reserved_page_size is MAXALIGNED, would be better.
> PageValidateSpecialPointer(Page page)
> {
>     Assert(page);
> -   Assert(((PageHeader) page)->pd_special <= BLCKSZ);
> +   Assert((((PageHeader) page)->pd_special - reserved_page_size) <= BLCKSZ);
This check is incorrect. With your code it would allow pd_special past the
end of the block. If you want to put the reserved_space_size effectively
inside the special area, this check should instead be:
+ Assert(((PageHeader) page)->pd_special <= (BLCKSZ - reserved_page_size));

Or, equally valid

+ Assert((((PageHeader) page)->pd_special + reserved_page_size) <= BLCKSZ);
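To make the difference concrete, here is a standalone rendering of
both assertion forms (plain C with constants, not actual PostgreSQL
code): the patch's subtraction form accepts a pd_special that runs
past the usable area, while the corrected form rejects it.

```c
/* Demonstrates why Assert(pd_special - reserved_page_size <= BLCKSZ)
 * is too lax: a pd_special overlapping the reserved tail still passes.
 * Standalone sketch; not actual PostgreSQL code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLCKSZ 8192

static int reserved_page_size = 8;

/* the check as written in the patch */
static bool
check_v1(uint16_t pd_special)
{
	return (pd_special - reserved_page_size) <= BLCKSZ;
}

/* the corrected check: the special area must end at or before the
 * start of the reserved tail */
static bool
check_v2(uint16_t pd_special)
{
	return (pd_special + reserved_page_size) <= BLCKSZ;
}
```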
> + * +-------------+-----+------------+-----------------+
> + * | ... tuple2 tuple1 | "special space" | "reserved" |
> + * +-------------------+------------+-----------------+
Could you fix the table display if / when you revise the patchset? It seems
to me that the corners don't line up with the column borders.
0002:
> Make the output of "select_views" test stable
>
> Changing the reserved_page_size has resulted in non-stable results
> for this test.

This makes sense; what kind of instability are we talking about? Are
there different results for runs with the same binary, or is this
across compilations?
0003 and up were not yet reviewed in depth.
Kind regards,
Matthias van de Meent
[0]: /messages/by-id/CA+TgmoaCeQ2b-BVgVfF8go8zFoceDjJq9w4AFQX7u6Acfdn2uA@mail.gmail.com
[1]: /messages/by-id/CAEze2Wi5wYinU7nYxyKe_C0DRc6uWYa8ivn5=zg63nKtHBnn8A@mail.gmail.com
Hi Matthias,
> Did you read the related thread with related discussion from last
> June, "Re: better page-level checksums" [0]? In that I argued that
> space at the end of a page is already allocated for the AM, and that
> reserving variable space at the end of the page for non-AM usage is
> wasting the AM's performance potential.
Yes, I had read parts of that thread among others, but have given it a
re-read. I can see the point you're making here, and agree that if we
can allocate between pd_special and pd_upper that could make sense. I
am a little unclear as to what performance impacts for the AM there
would be if this additional space were ahead or behind the page
special area; it seems like if this is something that needs to live on
the page *somewhere* just being aligned correctly would be sufficient
from the AM's standpoint. Considering that I am trying to make this
have zero storage impact if these features are not active, the impact
on a cluster with no additional features would be moot from a storage
perspective, no?
> Apart from that: Is this variable-sized 'metadata' associated with
> smgr infrastructure only, or is it also available for AM features?
> If not; then this is a strong -1. The amount of tasks smgr needs to
> do on a page is generally much less than the amount of tasks an AM
> needs to do; so in my view the AM has priority in prime page real
> estate, not smgr or related infrastructure.
I will confess to a slightly wobbly understanding of the delineation
of responsibility here. I was under the impression that by modifying
any consumer of PageHeaderData this would be sufficient to cover all
AMs for the types of cluster-wide options we'd be concerned about (say
extended checksums, multiple page encryption schemes, or other
per-page information we haven't yet anticipated). Reading smgr/README
and the various access/*/README has not made the distinction clear to
me yet.
> re: PageFeatures
>
> I'm not sure I understand the goal, nor the reasoning. Shouldn't this
> be part of the storage manager (smgr) implementation / can't this be
> part of the smgr of the relation?
For at least the feature cases I'm anticipating, this would apply to
any disk page that may have user data, set (at least initially) at
initdb time, so should apply to any pages in the cluster, regardless
of AM.
> re: use of pd_checksum
>
> I mentioned this in the above-mentioned thread too, in [1], that we
> could use pd_checksum as an extra area marker for this
> storage-specific data, which would be located between pd_upper and
> pd_special.
I do think that we could indeed use this as an additional in-page
pointer, but at least for this version was keeping things
backwards-compatible. Peter G (I think) also made some good points
about how to include the various status bits on the page somehow in
terms of making a page completely self-contained.
> Re: patch contents
>
> 0001:
>
>> + specialSize = MAXALIGN(specialSize) + reserved_page_size;
>
> This needs to be aligned, so MAXALIGN(specialSize +
> reserved_page_size), or an assertion that reserved_page_size is
> MAXALIGNED, would be better.
It is currently aligned via the space calculation return value but
agree that folding it into an assert or reworking it explicitly is
clearer.
>> PageValidateSpecialPointer(Page page)
>> {
>>     Assert(page);
>> -   Assert(((PageHeader) page)->pd_special <= BLCKSZ);
>> +   Assert((((PageHeader) page)->pd_special - reserved_page_size) <= BLCKSZ);
>
> This check is incorrect. With your code it would allow pd_special
> past the end of the block. If you want to put the reserved_space_size
> effectively inside the special area, this check should instead be:
>
> + Assert(((PageHeader) page)->pd_special <= (BLCKSZ - reserved_page_size));
>
> Or, equally valid
>
> + Assert((((PageHeader) page)->pd_special + reserved_page_size) <= BLCKSZ);
Yup, I think I inverted my logic there; thanks.
>> + * +-------------+-----+------------+-----------------+
>> + * | ... tuple2 tuple1 | "special space" | "reserved" |
>> + * +-------------------+------------+-----------------+
>
> Could you fix the table display if / when you revise the patchset? It
> seems to me that the corners don't line up with the column borders.
Sure thing.
> 0002:
>
>> Make the output of "select_views" test stable
>>
>> Changing the reserved_page_size has resulted in non-stable results
>> for this test.
>
> This makes sense, what kind of instability are we talking about? Are
> there different results for runs with the same binary, or is this
> across compilations?
When running with the same compilation/initdb settings, the test
results are stable, but differ depending what options you chose, so
`make installcheck` output will fail when testing a cluster with
different options vs upstream HEAD without these patches, etc.
> 0003 and up were not yet reviewed in depth.
Thanks, I appreciate the feedback so far.
On Sat, 29 Oct 2022 at 00:25, David Christensen
<david.christensen@crunchydata.com> wrote:
Hi Matthias,
>> Did you read the related thread with related discussion from last
>> June, "Re: better page-level checksums" [0]? In that I argued that
>> space at the end of a page is already allocated for the AM, and
>> that reserving variable space at the end of the page for non-AM
>> usage is wasting the AM's performance potential.
>
> Yes, I had read parts of that thread among others, but have given it
> a re-read. I can see the point you're making here, and agree that if
> we can allocate between pd_special and pd_upper that could make
> sense. I am a little unclear as to what performance impacts for the
> AM there would be if this additional space were ahead or behind the
> page special area; it seems like if this is something that needs to
> live on the page *somewhere* just being aligned correctly would be
> sufficient from the AM's standpoint.
It would be sufficient, but it is definitely suboptimal. See
https://commitfest.postgresql.org/40/3543/ for a patch that is being
held back by putting stuff behind the special area.
I don't really care much about the storage layout on-disk, but I do
care that AMs have efficient access to their data. For the page
header, line pointers, and special area, that is currently guaranteed
by the current page layout. However, for the special area, that
currently guaranteed offset of (BLCKSZ -
MAXALIGN(sizeof(IndexOpaque))) will get broken as there would be more
space in the special area than the AM would be expecting. Right now,
our index AMs are doing pointer chasing during special area lookups
for no good reason, but with the patch it would be required. I don't
like that at all.
I understand that it is a requirement to store this reserved space in
a fixed place on the on-disk page (you must know where the checksum is
at a static place on the page, otherwise you'd potentially
mis-validate a page), but that requirement is not there for in-memory
storage. I think it's a small price to pay to swap the fields around
during R/W operations - the largest size of special area is currently
24 bytes, and the proposals I've seen for this extra storage area
would not need it to be actually filled with data whilst the page is
being used by the AM (checksum could be zeroed in in-memory
operations, and it'd get set during writeback; same with all other
fields I can imagine the storage system using).
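As a sketch of the swap idea (purely illustrative, with hypothetical
helper names; not PostgreSQL code): the AM's special area stays at its
traditional fixed tail offset while the page is in memory, and only
trades places with the reserved storage area at write-out / read-in
time.

```c
/* Illustrative sketch: keep the special area at the fixed in-memory
 * tail offset, and rearrange only during R/W operations so the
 * reserved storage area lands at the fixed on-disk tail. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192

/* writeback: slide the special area down so the storage system's
 * reserved bytes land at the fixed on-disk tail */
static void
to_disk_layout(uint8_t *page, size_t special_size, size_t reserved_size,
			   const uint8_t *reserved_data)
{
	memmove(page + BLCKSZ - reserved_size - special_size,
			page + BLCKSZ - special_size, special_size);
	memcpy(page + BLCKSZ - reserved_size, reserved_data, reserved_size);
}

/* read-in: capture the reserved bytes, then put the special area back
 * at its fixed in-memory tail offset */
static void
to_memory_layout(uint8_t *page, size_t special_size, size_t reserved_size,
				 uint8_t *reserved_out)
{
	memcpy(reserved_out, page + BLCKSZ - reserved_size, reserved_size);
	memmove(page + BLCKSZ - special_size,
			page + BLCKSZ - reserved_size - special_size, special_size);
}
```

Since special areas are at most a couple dozen bytes today, each swap
is a small memmove, paid once per physical I/O rather than once per
page access.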
> Considering that I am trying to make this have zero storage impact if
> these features are not active, the impact on a cluster with no
> additional features would be moot from a storage perspective, no?
The issue is that I'd like to eliminate the redirection from the page
header in the hot path. Currently, we can do that, and pd_special
would be little more than a hint to the smgr and pd_linp code that
that area is special and reserved for this access method's private
data, so that it is not freed. If you stick something extra in there,
it's not special for the AM's private data, and the AM won't be able
to use pd_special for similar uses as pd_linp+pd_lower. I'd rather
have the storage system use its own not-special area, choreographed by
e.g. a reuse of pd_checksum for one more page offset. Swapping the
fields around between on-disk and in-memory doesn't need to be an
issue, as special areas are rarely very large.
Every index type we support utilizes the special area. Wouldn't those
in-memory operations have priority on this useful space, as opposed to
a storage system that maybe will be used in new clusters, and even
then only during R/W operations to disk (each at most once for N
memory operations)?
>> Apart from that: Is this variable-sized 'metadata' associated with
>> smgr infrastructure only, or is it also available for AM features?
>> If not; then this is a strong -1. The amount of tasks smgr needs to
>> do on a page is generally much less than the amount of tasks an AM
>> needs to do; so in my view the AM has priority in prime page real
>> estate, not smgr or related infrastructure.
>
> I will confess to a slightly wobbly understanding of the delineation
> of responsibility here. I was under the impression that by modifying
> any consumer of PageHeaderData this would be sufficient to cover all
> AMs for the types of cluster-wide options we'd be concerned about
> (say extended checksums, multiple page encryption schemes, or other
> per-page information we haven't yet anticipated). Reading smgr/README
> and the various access/*/README has not made the distinction clear to
> me yet.
pd_special has (historically) been reserved for access methods'
page-level private data. If you add to this area, shouldn't that be
space that the AM should be able to hook into as well? Or are all
those features limited to the storage system only; i.e. the storage
system decides what's best for the AM's page handling w.r.t. physical
storage?
>> re: PageFeatures
>>
>> I'm not sure I understand the goal, nor the reasoning. Shouldn't
>> this be part of the storage manager (smgr) implementation / can't
>> this be part of the smgr of the relation?
>
> For at least the feature cases I'm anticipating, this would apply to
> any disk page that may have user data, set (at least initially) at
> initdb time, so should apply to any pages in the cluster, regardless
> of AM.
OK, so having a storage manager for each supported set of features is
not planned for this. Understood.
>> re: use of pd_checksum
>>
>> I mentioned this in the above-mentioned thread too, in [1], that we
>> could use pd_checksum as an extra area marker for this
>> storage-specific data, which would be located between pd_upper and
>> pd_special.
>
> I do think that we could indeed use this as an additional in-page
> pointer, but at least for this version was keeping things
> backwards-compatible. Peter G (I think) also made some good points
> about how to include the various status bits on the page somehow in
> terms of making a page completely self-contained.
I think that adding page header bits would suffice for backwards
compatibility if we'd want to reuse pd_checksum. A new
PD_CHECKSUM_REUSED_FOR_STORAGE would suffice here; it would be unset
in normal (pre-patch, or without these fancy new features) clusters.
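A sketch of how such a flag bit could gate the reinterpretation of
pd_checksum (the flag name comes from the suggestion above, but its
bit value and the struct here are hypothetical):

```c
/* Hypothetical sketch: a pd_flags bit deciding whether pd_checksum
 * holds a checksum or an offset delimiting a storage-specific area
 * between pd_upper and pd_special. Not actual PostgreSQL code. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PD_CHECKSUM_REUSED_FOR_STORAGE 0x0008	/* hypothetical bit value */

typedef struct
{
	uint16_t	pd_checksum;	/* checksum, or storage-area offset */
	uint16_t	pd_flags;
} MiniPageHeader;

/* pre-patch clusters never set the bit, so they keep the old reading */
static bool
has_storage_area(const MiniPageHeader *p)
{
	return (p->pd_flags & PD_CHECKSUM_REUSED_FOR_STORAGE) != 0;
}

static uint16_t
storage_area_offset(const MiniPageHeader *p)
{
	return has_storage_area(p) ? p->pd_checksum : 0;
}
```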
Kind regards,
Matthias van de Meent
PS. sorry for the rant. I hope my arguments are clear why I dislike
the storage area being placed behind the special area in memory.
Per some offline discussion with Stephen and incorporating some of the
feedback I've gotten I'm including the following changes/revisions:
1. Change the signature of any macros that rely on a dynamic component
to look like a function so you can more easily determine in-code
whether something is truly a constant/compile time calculation or a
runtime one.
2. We use a new page flag to indicate whether "extended page features"
are enabled on the given page. If it is set then we look for the
1-byte trailer containing the bitmap of enabled features. We allow
space for 7 page features and reserve the final high bit for future
use/change of interpretation to accommodate more.
3. Consolidate the extended checksums into a 56-bit checksum that
immediately precedes the 1-byte flag. Choice of 64-bit checksum is
arbitrary just based on some MIT-licensed code I found, so just
considering this proof of concept, not necessarily promoting that
specific calculation. (I think I included some additional checksum
variants from the earlier revision for ease of testing various
approaches.)
4. Ensure the whole area is MAXALIGNed, and fix a few bugs that were
pointed out in this thread.
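Under the layout described in points 2 and 3, the final 8 bytes of a
page hold a 7-byte (56-bit) checksum followed by the 1-byte feature
trailer, with the trailer's high bit held in reserve. A hedged,
standalone sketch of that packing (names and byte order are
illustrative, not the patch's actual code):

```c
/* Illustrative packing of the final 8 bytes of a page: 7 checksum
 * bytes, then the 1-byte feature-flag trailer with its high bit
 * reserved. Byte order chosen arbitrarily for the sketch. */
#include <assert.h>
#include <stdint.h>

#define PF_TRAILER_RESERVED_BIT 0x80	/* high bit kept for future use */

static void
write_page_tail(uint8_t *tail8, uint64_t checksum56, uint8_t flags)
{
	for (int i = 0; i < 7; i++)	/* 7 checksum bytes first */
		tail8[i] = (uint8_t) (checksum56 >> (8 * i));
	tail8[7] = flags & ~PF_TRAILER_RESERVED_BIT; /* flag byte is last */
}

static uint8_t
read_page_flags(const uint8_t *tail8)
{
	return tail8[7] & ~PF_TRAILER_RESERVED_BIT;
}

static uint64_t
read_page_checksum56(const uint8_t *tail8)
{
	uint64_t	cs = 0;

	for (int i = 0; i < 7; i++)
		cs |= ((uint64_t) tail8[i]) << (8 * i);
	return cs;
}
```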
Patches are:
1. make select_views stable, a prerequisite for anything that changes
the number of tuples per page
2. add reserved_page_size handling and rework existing code to account
for this additional space usage
3. main PageFeatures-related code; introduce that abstraction layer,
along with the trailing byte on the page with the enabled features for
this specific page. We also add an additional param to PageInit()
with the page features active on this page; currently all call sites
are using the cluster-wide cluster_page_features as the parameter, so
all pages share what is stored in the control file based on initdb
options. However, routines which query page features look at the
actual page itself, so in fact we are able to enable features
piecemeal at the page/relation level if we so desire, or turn them off
for specific types of pages, say. This patch also includes the
additional pd_flags bit to enable that interpretation.
4. Actual extended checksums PageFeature. Rather than two separate
implementations as in the previous patch series, we are using 56 bits
of a 64-bit checksum, stored as the high 7 bytes of the final 8 in the
page where this is enabled.
5. wasted_space PageFeature just to demo multiple features in play.
Thanks,
David
Attachments:
- v2-0001-Make-the-output-of-select_views-test-stable.patch
- v2-0002-Add-reserved_page_space-to-Page-structure.patch
- v2-0003-Add-pagefeat-plus-new-page-status-bit-and-page-fe.patch
- v2-0004-Add-extended-page-checksums-feature.patch
- v2-0005-A-second-page-feature-just-to-allocate-more-space.patch
Looking into some CF bot failures which didn't show up locally. Will
send a v3 when resolved.
So here is a v3, incorporating additional bug fixes and some design
revisions. I have narrowed this down to 3 patches: I fixed the bugs
that were leading to the instability of the specific test file, so I
am dropping that patch, as well as removing the useless POC "wasted
space" feature.
The following pieces are left:
0001 - adjust the codebase to utilize the "reserved_page_space"
variable for all offsets rather than assuming compile-time constants.
This allows us to effectively allocate a fixed chunk of storage from
the end of the page and have everything still work on this cluster.
0002 - add the Page Feature abstraction. This allows you to utilize
this chunk of storage, as well as query for feature use at the page
level.
0003 - the first page feature, 64-bit checksums (soon to be
renumbered when GCM storage for TDE is introduced, though the two
features are designed to be incompatible). This includes an
arbitrarily chosen 64-bit checksum implementation, so we probably will
need to write our own or ensure that we have something
license-compatible.
This is rebased and current as of today and passes all CI tests, so it
should be in a good place to start looking at.
Best,
David
Attachments:
- v3-0001-Add-reserved_page_space-to-Page-structure.patch
- v3-0002-Add-Page-Features-optional-per-page-storage-alloc.patch
- v3-0003-Add-page-feature-for-64-bit-checksums.patch
Refreshing this with HEAD as of today, v4.
Attachments:
- v4-0001-Add-reserved_page_space-to-Page-structure.patch
- v4-0002-Add-Page-Features-optional-per-page-storage-alloc.patch
- v4-0003-Add-page-feature-for-64-bit-checksums.patch
Greetings,
* David Christensen (david.christensen@crunchydata.com) wrote:
> Refreshing this with HEAD as of today, v4.
Thanks for updating this!
> Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure
>
> This space is reserved for extended data on the Page structure which
> will be ultimately used for encrypted data, extended checksums, and
> potentially other things. This data appears at the end of the Page,
> after any `pd_special` area, and will be calculated at runtime based
> on specific ControlFile features.
>
> No effort is made to ensure this is backwards-compatible with
> existing clusters for `pg_upgrade`, as we will require logical
> replication to move data into a cluster with different settings
> here.
This initial patch, at least, does maintain pg_upgrade as the
reserved_page_size (maybe not a great name?) is set to 0, right?
Basically this is just introducing the concept of a reserved_page_size
and adjusting all of the code that currently uses BLCKSZ or
PageGetPageSize() to account for this extra space.
Looking at the changes to bufpage.h, in particular ...
> diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
> @@ -19,6 +19,14 @@
>  #include "storage/item.h"
>  #include "storage/off.h"
>
> +extern PGDLLIMPORT int reserved_page_size;
> +
> +#define SizeOfPageReservedSpace() reserved_page_size
> +#define MaxSizeOfPageReservedSpace 0
> +
> +/* strict upper bound on the amount of space occupied we have reserved on
> + * pages in this cluster */
This will eventually be calculated based on what features are supported
concurrently?
> @@ -36,10 +44,10 @@
>   * |             v pd_upper                           |
>   * +-------------+------------------------------------+
>   * |             | tupleN ...                         |
> - * +-------------+------------------+-----------------+
> - * | ... tuple3 tuple2 tuple1 | "special space" |
> - * +--------------------------------+-----------------+
> - *                                  ^ pd_special
> + * +-------------+-----+------------+----+------------+
> + * | ... tuple2 tuple1 | "special space" | "reserved" |
> + * +-------------------+------------+----+------------+
> + *                     ^ pd_special      ^ reserved_page_space
Right, adds a dynamic amount of space 'post-special area'.
> @@ -73,6 +81,8 @@
>   * stored as the page trailer.  an access method should always
>   * initialize its pages with PageInit and then set its own opaque
>   * fields.
> + *
> + * XXX - update more comments here about reserved_page_space
>   */
Would be good to do. ;)
> @@ -325,7 +335,7 @@
>  static inline void
>  PageValidateSpecialPointer(Page page)
>  {
>  	Assert(page);
> -	Assert(((PageHeader) page)->pd_special <= BLCKSZ);
> +	Assert((((PageHeader) page)->pd_special + reserved_page_size) <= BLCKSZ);
>  	Assert(((PageHeader) page)->pd_special >= SizeOfPageHeaderData);
>  }
This is just one usage ... but seems like maybe we should be using
PageGetPageSize() here instead of BLCKSZ, and that more-or-less
throughout? Nearly everywhere we're using BLCKSZ today to give us that
compile-time advantage of a fixed block size is going to lose that
advantage anyway thanks to reserved_page_size being run-time. Now, one
up-side to this is that it'd also get us closer to being able to support
dynamic block sizes concurrently which would be quite interesting. That
is, a special tablespace with a 32KB block size while the rest are the
traditional 8KB. This would likely require multiple shared buffer
pools, of course...
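For reference, PageGetPageSize already derives the size from the page
itself rather than from BLCKSZ: pd_pagesize_version stores the page
size (always a multiple of 256) or-ed with the layout version, so the
low byte is free for the version. A simplified standalone rendering of
that logic (the real macros live in bufpage.h):

```c
/* Simplified rendering of PageGetPageSize / the pd_pagesize_version
 * encoding; standalone sketch, not the actual bufpage.h macros. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct
{
	uint16_t	pd_pagesize_version;	/* size | layout version */
} MiniHeader;

static size_t
page_get_page_size(const MiniHeader *p)
{
	/* sizes are multiples of 256, so masking off the low byte works */
	return p->pd_pagesize_version & (uint16_t) 0xFF00;
}

static int
page_get_layout_version(const MiniHeader *p)
{
	return p->pd_pagesize_version & 0x00FF;
}
```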
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c index 9a302ddc30..a93cd9df9f 100644 --- a/src/backend/storage/page/bufpage.c +++ b/src/backend/storage/page/bufpage.c @@ -26,6 +26,8 @@ /* GUC variable */ bool ignore_checksum_failure = false;+int reserved_page_size = 0; /* how much page space to reserve for extended unencrypted metadata */ +/* ----------------------------------------------------------------
* Page support functions
@@ -43,7 +45,7 @@ PageInit(Page page, Size pageSize, Size specialSize)
{
	PageHeader	p = (PageHeader) page;

-	specialSize = MAXALIGN(specialSize);
+	specialSize = MAXALIGN(specialSize) + reserved_page_size;
Rather than make it part of specialSize, I would think we'd be better
off just treating them independently. Eg, the later pd_upper setting
would be done by:
p->pd_upper = pageSize - specialSize - reserved_page_size;
etc.
@@ -186,7 +188,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
  * one that is both unused and deallocated.
  *
  * If flag PAI_IS_HEAP is set, we enforce that there can't be more than
- * MaxHeapTuplesPerPage line pointers on the page.
+ * MaxHeapTuplesPerPage() line pointers on the page.
Making MaxHeapTuplesPerPage() runtime dynamic is a requirement for
supporting multiple page sizes concurrently ... but I'm not sure it's
actually required for the reserved_page_size idea as currently
considered. The reason is that with 8K or larger pages, the amount of
space we're already throwing away is at least 20 bytes, if I did my math
right. If we constrain reserved_page_size to be 20 bytes or less, as I
believe we're currently thinking we won't need that much, then we could
perhaps keep MaxHeapTuplesPerPage as a compile-time constant.
On the other hand, to the extent that we want to consider having
variable page sizes in the future, perhaps we do want to change this.
If so, the approach broadly looks reasonable to me, but I'd suggest we
make that a separate patch from the introduction of reserved_page_size.
@@ -211,7 +213,7 @@ PageAddItemExtended(Page page,
 	if (phdr->pd_lower < SizeOfPageHeaderData ||
 		phdr->pd_lower > phdr->pd_upper ||
 		phdr->pd_upper > phdr->pd_special ||
-		phdr->pd_special > BLCKSZ)
+		phdr->pd_special + reserved_page_size > BLCKSZ)
 		ereport(PANIC,
 				(errcode(ERRCODE_DATA_CORRUPTED),
 				 errmsg("corrupted page pointers: lower = %u, upper = %u, special = %u",
Probably should add reserved_page_size to that errmsg output? Also,
this check of pointers seems to be done multiple times- maybe it should
be moved into a #define or similar?
@@ -723,7 +725,7 @@ PageRepairFragmentation(Page page)
 	if (pd_lower < SizeOfPageHeaderData ||
 		pd_lower > pd_upper ||
 		pd_upper > pd_special ||
-		pd_special > BLCKSZ ||
+		pd_special + reserved_page_size > BLCKSZ ||
 		pd_special != MAXALIGN(pd_special))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATA_CORRUPTED),
This ends up being the same as above ...
@@ -1066,7 +1068,7 @@ PageIndexTupleDelete(Page page, OffsetNumber offnum)
 	if (phdr->pd_lower < SizeOfPageHeaderData ||
 		phdr->pd_lower > phdr->pd_upper ||
 		phdr->pd_upper > phdr->pd_special ||
-		phdr->pd_special > BLCKSZ ||
+		phdr->pd_special + reserved_page_size > BLCKSZ ||
 		phdr->pd_special != MAXALIGN(phdr->pd_special))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATA_CORRUPTED),
And here ...
@@ -1201,7 +1203,7 @@ PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems)
 	if (pd_lower < SizeOfPageHeaderData ||
 		pd_lower > pd_upper ||
 		pd_upper > pd_special ||
-		pd_special > BLCKSZ ||
+		pd_special + reserved_page_size > BLCKSZ ||
 		pd_special != MAXALIGN(pd_special))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATA_CORRUPTED),
And here ...
@@ -1307,7 +1309,7 @@ PageIndexTupleDeleteNoCompact(Page page, OffsetNumber offnum)
 	if (phdr->pd_lower < SizeOfPageHeaderData ||
 		phdr->pd_lower > phdr->pd_upper ||
 		phdr->pd_upper > phdr->pd_special ||
-		phdr->pd_special > BLCKSZ ||
+		phdr->pd_special + reserved_page_size > BLCKSZ ||
 		phdr->pd_special != MAXALIGN(phdr->pd_special))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATA_CORRUPTED),
And here ...
@@ -1419,7 +1421,7 @@ PageIndexTupleOverwrite(Page page, OffsetNumber offnum,
 	if (phdr->pd_lower < SizeOfPageHeaderData ||
 		phdr->pd_lower > phdr->pd_upper ||
 		phdr->pd_upper > phdr->pd_special ||
-		phdr->pd_special > BLCKSZ ||
+		phdr->pd_special + reserved_page_size > BLCKSZ ||
 		phdr->pd_special != MAXALIGN(phdr->pd_special))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATA_CORRUPTED),
And here ...
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 6979aff727..060c4ab3e3 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -489,12 +489,12 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
 	/*
 	 * Size Bloom filter based on estimated number of tuples in index,
 	 * while conservatively assuming that each block must contain at least
-	 * MaxTIDsPerBTreePage / 3 "plain" tuples -- see
+	 * MaxTIDsPerBTreePage() / 3 "plain" tuples -- see
 	 * bt_posting_plain_tuple() for definition, and details of how posting
 	 * list tuples are handled.
 	 */
 	total_pages = RelationGetNumberOfBlocks(rel);
-	total_elems = Max(total_pages * (MaxTIDsPerBTreePage / 3),
+	total_elems = Max(total_pages * (MaxTIDsPerBTreePage() / 3),
 					  (int64) state->rel->rd_rel->reltuples);
 
 	/* Generate a random seed to avoid repetition */
 	seed = pg_prng_uint64(&pg_global_prng_state);
Making MaxTIDsPerBTreePage dynamic looks to be required as it doesn't
end up with any 'leftover' space, from what I can tell. Again, though,
perhaps this should be split out as an independent patch from the rest.
That is- we can change the higher-level functions to be dynamic in the
initial patches, and then eventually we'll get down to making the
lower-level functions dynamic.
diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index efdf9415d1..8ebabdd7ee 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -131,7 +131,7 @@ typedef struct BloomMetaPageData
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
-#define BloomMetaBlockN		(sizeof(FreeBlockNumberArray) / sizeof(BlockNumber))
+#define BloomMetaBlockN()	((sizeof(FreeBlockNumberArray) - SizeOfPageReservedSpace()) / sizeof(BlockNumber))
 
 #define BloomPageGetMeta(page)	((BloomMetaPageData *) PageGetContents(page))
@@ -151,6 +151,7 @@ typedef struct BloomState
 #define BloomPageGetFreeSpace(state, page) \
 	(BLCKSZ - MAXALIGN(SizeOfPageHeaderData) \
+	 - SizeOfPageReservedSpace() \
 	 - BloomPageGetMaxOffset(page) * (state)->sizeOfBloomTuple \
 	 - MAXALIGN(sizeof(BloomPageOpaqueData)))
This formulation (or something close to it) tends to happen quite a bit:
(BLCKSZ - MAXALIGN(SizeOfPageHeaderData) - SizeOfPageReservedSpace() ...
This is basically asking for "amount of usable space" where the
resulting 'usable space' either includes line pointers and tuples or
similar, or doesn't. Perhaps we should break this down into two
patches- one which provides a function to return usable space on a page,
and then the patch to add reserved_page_size can simply adjust that
instead of changing the very, very many places we have this formulation.
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index d935ed8fbd..d3d74a9d28 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -430,10 +430,10 @@ BloomFillMetapage(Relation index, Page metaPage)
 	 */
 	BloomInitPage(metaPage, BLOOM_META);
 	metadata = BloomPageGetMeta(metaPage);
-	memset(metadata, 0, sizeof(BloomMetaPageData));
+	memset(metadata, 0, sizeof(BloomMetaPageData) - SizeOfPageReservedSpace());
This doesn't seem quite right? The reserved space is off at the end of
the page and this is 0'ing the space immediately after the page header,
if I'm following correctly, and only to the size of BloomMetaPageData...
 	metadata->magickNumber = BLOOM_MAGICK_NUMBER;
 	metadata->opts = *opts;
-	((PageHeader) metaPage)->pd_lower += sizeof(BloomMetaPageData);
+	((PageHeader) metaPage)->pd_lower += sizeof(BloomMetaPageData) - SizeOfPageReservedSpace();
Not quite following what's going on here either.
diff --git a/contrib/bloom/blvacuum.c b/contrib/bloom/blvacuum.c
@@ -116,7 +116,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 	 */
 	if (BloomPageGetMaxOffset(page) != 0 &&
 		BloomPageGetFreeSpace(&state, page) >= state.sizeOfBloomTuple &&
-		countPage < BloomMetaBlockN)
+		countPage < BloomMetaBlockN())
 		notFullPage[countPage++] = blkno;
Looks to be another opportunity to have a separate patch making this
change first before actually changing the lower-level #define's.
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
@@ -217,7 +217,7 @@ brin_form_tuple(BrinDesc *brdesc, BlockNumber blkno, BrinMemTuple *tuple,
 	 * datatype, try to compress it in-line.
 	 */
 	if (!VARATT_IS_EXTENDED(DatumGetPointer(value)) &&
-		VARSIZE(DatumGetPointer(value)) > TOAST_INDEX_TARGET &&
+		VARSIZE(DatumGetPointer(value)) > TOAST_INDEX_TARGET() &&
 		(atttype->typstorage == TYPSTORAGE_EXTENDED ||
 		 atttype->typstorage == TYPSTORAGE_MAIN))
 	{
Probably could be another patch but also if we're going to change
TOAST_INDEX_TARGET to be a function we should probably not have it named
in all-CAPS.
diff --git a/src/backend/access/gin/gindatapage.c b/src/backend/access/gin/gindatapage.c
@@ -535,7 +535,7 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack,
 		 * a single byte, and we can use all the free space on the old page as
 		 * well as the new page.  For simplicity, ignore segment overhead etc.
 		 */
-		maxitems = Min(maxitems, freespace + GinDataPageMaxDataSize);
+		maxitems = Min(maxitems, freespace + GinDataPageMaxDataSize());
 	}
 	else
 	{
Ditto.
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
@@ -38,8 +38,8 @@
 /* GUC parameter */
 int			gin_pending_list_limit = 0;
 
-#define GIN_PAGE_FREESIZE \
-	( BLCKSZ - MAXALIGN(SizeOfPageHeaderData) - MAXALIGN(sizeof(GinPageOpaqueData)) )
+#define GIN_PAGE_FREESIZE() \
+	( BLCKSZ - MAXALIGN(SizeOfPageHeaderData) - MAXALIGN(sizeof(GinPageOpaqueData)) - SizeOfPageReservedSpace() )
Another case of BLCKSZ - MAXALIGN(SizeOfPageHeaderData) -
SizeOfPageReservedSpace() ...
@@ -450,7 +450,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * ginInsertCleanup() should not be called inside our CRIT_SECTION.
 	 */
 	cleanupSize = GinGetPendingListCleanupSize(index);
-	if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * 1024L)
+	if (metadata->nPendingPages * GIN_PAGE_FREESIZE() > cleanupSize * 1024L)
 		needCleanup = true;
Also shouldn't be all-CAPS.
diff --git a/src/backend/access/nbtree/nbtsplitloc.c b/src/backend/access/nbtree/nbtsplitloc.c
index 43b67893d9..5babbb457a 100644
--- a/src/backend/access/nbtree/nbtsplitloc.c
+++ b/src/backend/access/nbtree/nbtsplitloc.c
@@ -156,7 +156,7 @@ _bt_findsplitloc(Relation rel,
 	/* Total free space available on a btree page, after fixed overhead */
 	leftspace = rightspace =
-		PageGetPageSize(origpage) - SizeOfPageHeaderData -
+		PageGetPageSize(origpage) - SizeOfPageHeaderData - SizeOfPageReservedSpace() -
 		MAXALIGN(sizeof(BTPageOpaqueData));
Also here ... though a bit interesting that this uses PageGetPageSize()
instead of BLCKSZ.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 011ec18015..022b5eee4e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -154,3 +154,4 @@ int64		VacuumPageDirty = 0;
 int			VacuumCostBalance = 0;	/* working state for vacuum */
bool VacuumCostActive = false;
+
Unnecessary whitespace hunk ?
Thanks!
Stephen
On Fri, May 12, 2023 at 7:48 PM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* David Christensen (david.christensen@crunchydata.com) wrote:
Refreshing this with HEAD as of today, v4.
Thanks for updating this!
Thanks for the patience in my response here.
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure
This space is reserved for extended data on the Page structure which will be ultimately used for
encrypted data, extended checksums, and potentially other things. This data appears at the end of
the Page, after any `pd_special` area, and will be calculated at runtime based on specific
ControlFile features.

No effort is made to ensure this is backwards-compatible with existing clusters for `pg_upgrade`, as
we will require logical replication to move data into a cluster with different settings here.

This initial patch, at least, does maintain pg_upgrade as the
reserved_page_size (maybe not a great name?) is set to 0, right?
Basically this is just introducing the concept of a reserved_page_size
and adjusting all of the code that currently uses BLCKSZ or
PageGetPageSize() to account for this extra space.
Correct; a reserved_page_size of 0 would be the same page format as
currently exists, so you could use pg_upgrade with no page features
and be binary compatible with existing clusters.
Looking at the changes to bufpage.h, in particular ...
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
@@ -19,6 +19,14 @@
#include "storage/item.h"
 #include "storage/off.h"
 
+extern PGDLLIMPORT int reserved_page_size;
+
+#define SizeOfPageReservedSpace() reserved_page_size
+#define MaxSizeOfPageReservedSpace 0
+
+/* strict upper bound on the amount of space occupied we have reserved on
+ * pages in this cluster */

This will eventually be calculated based on what features are supported
concurrently?
Correct; these are fleshed out in later patches.
@@ -36,10 +44,10 @@
  * |            v pd_upper                            |
  * +-------------+------------------------------------+
  * |             | tupleN ...                         |
- * +-------------+------------------+-----------------+
- * | ... tuple3 tuple2 tuple1       | "special space" |
- * +--------------------------------+-----------------+
- *                                  ^ pd_special
+ * +-------------+-----+------------+----+------------+
+ * | ... tuple2 tuple1 | "special space" | "reserved" |
+ * +-------------------+------------+----+------------+
+ *                     ^ pd_special      ^ reserved_page_space

Right, adds a dynamic amount of space 'post-special area'.
Dynamic as in "fixed at initdb time" instead of compile time. However,
things are coded in such a way that the page feature bitmap is stored
on a given page, so different pages could have different
reserved_page_size depending on use case/code path. (Basically
preserving future flexibility while minimizing code changes here.) We
could utilize different features depending on what type of page it is,
say, or have different relations or tablespaces with different page
feature defaults.
@@ -73,6 +81,8 @@
  * stored as the page trailer.  an access method should always
  * initialize its pages with PageInit and then set its own opaque
  * fields.
+ *
+ * XXX - update more comments here about reserved_page_space
  */

Would be good to do. ;)
Next revision... :D
@@ -325,7 +335,7 @@
 static inline void
 PageValidateSpecialPointer(Page page)
 {
 	Assert(page);
-	Assert(((PageHeader) page)->pd_special <= BLCKSZ);
+	Assert((((PageHeader) page)->pd_special + reserved_page_size) <= BLCKSZ);
 	Assert(((PageHeader) page)->pd_special >= SizeOfPageHeaderData);
 }

This is just one usage ... but seems like maybe we should be using
PageGetPageSize() here instead of BLCKSZ, and that more-or-less
throughout? Nearly everywhere we're using BLCKSZ today to give us that
compile-time advantage of a fixed block size is going to lose that
advantage anyway thanks to reserved_page_size being run-time. Now, one
up-side to this is that it'd also get us closer to being able to support
dynamic block sizes concurrently which would be quite interesting. That
is, a special tablespace with a 32KB block size while the rest are the
traditional 8KB. This would likely require multiple shared buffer
pools, of course...
I think multiple shared-buffer pools is a ways off; but sure, this
would support this sort of use case as well. I am working on a new
patch for this series (probably the first one in the series) which
will actually just abstract away all existing compile-time usages of
BLCKSZ. This will be a start in that direction and also make the
reserved_page_size patch a bit more reasonable to review.
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 9a302ddc30..a93cd9df9f 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -26,6 +26,8 @@
 /* GUC variable */
 bool		ignore_checksum_failure = false;
+int			reserved_page_size = 0;	/* how much page space to reserve for extended unencrypted metadata */
+
 /* ----------------------------------------------------------------
* Page support functions
@@ -43,7 +45,7 @@ PageInit(Page page, Size pageSize, Size specialSize)
{
	PageHeader	p = (PageHeader) page;

-	specialSize = MAXALIGN(specialSize);
+	specialSize = MAXALIGN(specialSize) + reserved_page_size;

Rather than make it part of specialSize, I would think we'd be better
off just treating them independently. Eg, the later pd_upper setting
would be done by:

p->pd_upper = pageSize - specialSize - reserved_page_size;
etc.
I can see that there's a mild readability benefit, but really the
effect is local to PageInit(), so ¯\_(ツ)_/¯... happy to make that
change though.
@@ -186,7 +188,7 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
  * one that is both unused and deallocated.
  *
  * If flag PAI_IS_HEAP is set, we enforce that there can't be more than
- * MaxHeapTuplesPerPage line pointers on the page.
+ * MaxHeapTuplesPerPage() line pointers on the page.

Making MaxHeapTuplesPerPage() runtime dynamic is a requirement for
supporting multiple page sizes concurrently ... but I'm not sure it's
actually required for the reserved_page_size idea as currently
considered. The reason is that with 8K or larger pages, the amount of
space we're already throwing away is at least 20 bytes, if I did my math
right. If we constrain reserved_page_size to be 20 bytes or less, as I
believe we're currently thinking we won't need that much, then we could
perhaps keep MaxHeapTuplesPerPage as a compile-time constant.
In this version we don't have that explicit constraint. In practice I
don't know that we have many more than 20 bytes, at least for the
first few features, but I don't think we can count on that forever
going forward. At some point we're going to have to parameterize
these, so might as well do it in this pass, since how else would you
know that this magic value has been exceeded?
On the other hand, to the extent that we want to consider having
variable page sizes in the future, perhaps we do want to change this.
If so, the approach broadly looks reasonable to me, but I'd suggest we
make that a separate patch from the introduction of reserved_page_size.
The variable blocksize patch I'm working on includes some of this, so
this will be in the next revision.
@@ -211,7 +213,7 @@ PageAddItemExtended(Page page,
 	if (phdr->pd_lower < SizeOfPageHeaderData ||
 		phdr->pd_lower > phdr->pd_upper ||
 		phdr->pd_upper > phdr->pd_special ||
-		phdr->pd_special > BLCKSZ)
+		phdr->pd_special + reserved_page_size > BLCKSZ)
 		ereport(PANIC,
 				(errcode(ERRCODE_DATA_CORRUPTED),
 				 errmsg("corrupted page pointers: lower = %u, upper = %u, special = %u",

Probably should add reserved_page_size to that errmsg output? Also,
this check of pointers seems to be done multiple times- maybe it should
be moved into a #define or similar?
Sure, can change; agreed it'd be good to have. I just modified the
existing call sites and didn't attempt to change too much else.
[snipped other instances...]
Making MaxTIDsPerBTreePage dynamic looks to be required as it doesn't
end up with any 'leftover' space, from what I can tell. Again, though,
perhaps this should be split out as an independent patch from the rest.
That is- we can change the higher-level functions to be dynamic in the
initial patches, and then eventually we'll get down to making the
lower-level functions dynamic.
Same; should be accounted for in the next variable blocksize patch.
It does have a cascading effect though, so hard to make the high-level
functions dynamic but not the lower-level ones. What is the benefit in
this case for separating those two?
diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index efdf9415d1..8ebabdd7ee 100644
--- a/contrib/bloom/bloom.h
+++ b/contrib/bloom/bloom.h
@@ -131,7 +131,7 @@ typedef struct BloomMetaPageData
 #define BLOOM_MAGICK_NUMBER (0xDBAC0DED)
 
 /* Number of blocks numbers fit in BloomMetaPageData */
-#define BloomMetaBlockN		(sizeof(FreeBlockNumberArray) / sizeof(BlockNumber))
+#define BloomMetaBlockN()	((sizeof(FreeBlockNumberArray) - SizeOfPageReservedSpace()) / sizeof(BlockNumber))
 
 #define BloomPageGetMeta(page)	((BloomMetaPageData *) PageGetContents(page))
@@ -151,6 +151,7 @@ typedef struct BloomState
 #define BloomPageGetFreeSpace(state, page) \
 	(BLCKSZ - MAXALIGN(SizeOfPageHeaderData) \
+	 - SizeOfPageReservedSpace() \
 	 - BloomPageGetMaxOffset(page) * (state)->sizeOfBloomTuple \
 	 - MAXALIGN(sizeof(BloomPageOpaqueData)))

This formulation (or something close to it) tends to happen quite a bit:
(BLCKSZ - MAXALIGN(SizeOfPageHeaderData) - SizeOfPageReservedSpace() ...
This is basically asking for "amount of usable space" where the
resulting 'usable space' either includes line pointers and tuples or
similar, or doesn't. Perhaps we should break this down into two
patches- one which provides a function to return usable space on a page,
and then the patch to add reserved_page_size can simply adjust that
instead of changing the very, very many places we have this formulation.
Yeah, I can make this a computed expression; agreed it's pretty common
to have the usable space on the page so really any AM shouldn't know
or care about the details of either header or footer. Since we
already have PageGetContents() I will probably name it
PageGetContentsSize(). The AM can own everything from the pointer
returned by PageGetContents() through said size, allowing for both the
header and reserved_page_size in said computation.
diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
index d935ed8fbd..d3d74a9d28 100644
--- a/contrib/bloom/blutils.c
+++ b/contrib/bloom/blutils.c
@@ -430,10 +430,10 @@ BloomFillMetapage(Relation index, Page metaPage)
 	 */
 	BloomInitPage(metaPage, BLOOM_META);
 	metadata = BloomPageGetMeta(metaPage);
-	memset(metadata, 0, sizeof(BloomMetaPageData));
+	memset(metadata, 0, sizeof(BloomMetaPageData) - SizeOfPageReservedSpace());

This doesn't seem quite right? The reserved space is off at the end of
the page and this is 0'ing the space immediately after the page header,
if I'm following correctly, and only to the size of BloomMetaPageData...
I think you're correct with that analysis. BloomInitPage() would have
(probably?) had a zero'd page, so the under-sized memset would have gone
unnoticed in practice, but it's still good to fix.
 	metadata->magickNumber = BLOOM_MAGICK_NUMBER;
 	metadata->opts = *opts;
-	((PageHeader) metaPage)->pd_lower += sizeof(BloomMetaPageData);
+	((PageHeader) metaPage)->pd_lower += sizeof(BloomMetaPageData) - SizeOfPageReservedSpace();

Not quite following what's going on here either.
Heh, not sure either. Not sure if there was a reason or a mechanical
replacement, but will look when I do next revisions.
diff --git a/contrib/bloom/blvacuum.c b/contrib/bloom/blvacuum.c
@@ -116,7 +116,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 	 */
 	if (BloomPageGetMaxOffset(page) != 0 &&
 		BloomPageGetFreeSpace(&state, page) >= state.sizeOfBloomTuple &&
-		countPage < BloomMetaBlockN)
+		countPage < BloomMetaBlockN())
 		notFullPage[countPage++] = blkno;

Looks to be another opportunity to have a separate patch making this
change first before actually changing the lower-level #define's.
#include <stdpatch/variable-blocksize>
diff --git a/src/backend/access/brin/brin_tuple.c b/src/backend/access/brin/brin_tuple.c
@@ -217,7 +217,7 @@ brin_form_tuple(BrinDesc *brdesc, BlockNumber blkno, BrinMemTuple *tuple,
 	 * datatype, try to compress it in-line.
 	 */
 	if (!VARATT_IS_EXTENDED(DatumGetPointer(value)) &&
-		VARSIZE(DatumGetPointer(value)) > TOAST_INDEX_TARGET &&
+		VARSIZE(DatumGetPointer(value)) > TOAST_INDEX_TARGET() &&
 		(atttype->typstorage == TYPSTORAGE_EXTENDED ||
 		 atttype->typstorage == TYPSTORAGE_MAIN))
 	{

Probably could be another patch but also if we're going to change
TOAST_INDEX_TARGET to be a function we should probably not have it named
in all-CAPS.
Okay, can make those style changes as well; agreed that all-caps naming should be reserved for constants.
diff --git a/src/backend/access/nbtree/nbtsplitloc.c b/src/backend/access/nbtree/nbtsplitloc.c
index 43b67893d9..5babbb457a 100644
--- a/src/backend/access/nbtree/nbtsplitloc.c
+++ b/src/backend/access/nbtree/nbtsplitloc.c
@@ -156,7 +156,7 @@ _bt_findsplitloc(Relation rel,
 	/* Total free space available on a btree page, after fixed overhead */
 	leftspace = rightspace =
-		PageGetPageSize(origpage) - SizeOfPageHeaderData -
+		PageGetPageSize(origpage) - SizeOfPageHeaderData - SizeOfPageReservedSpace() -
 		MAXALIGN(sizeof(BTPageOpaqueData));

Also here ... though a bit interesting that this uses PageGetPageSize()
instead of BLCKSZ.
Yeah, a few little exceptions. Variable blocksize patch introduces
those every place it can, and ClusterBlockSize() anywhere it can't.
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 011ec18015..022b5eee4e 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -154,3 +154,4 @@ int64		VacuumPageDirty = 0;
 int			VacuumCostBalance = 0;	/* working state for vacuum */
bool VacuumCostActive = false;
+

Unnecessary whitespace hunk ?
Will clean up.
Thanks for the review,
David
Hi,
On 2023-05-09 17:08:26 -0500, David Christensen wrote:
From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
From: David Christensen <david@pgguru.net>
Date: Tue, 9 May 2023 16:56:15 -0500
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure

This space is reserved for extended data on the Page structure which will be ultimately used for
encrypted data, extended checksums, and potentially other things. This data appears at the end of
the Page, after any `pd_special` area, and will be calculated at runtime based on specific
ControlFile features.

No effort is made to ensure this is backwards-compatible with existing clusters for `pg_upgrade`, as
we will require logical replication to move data into a cluster with
different settings here.
The first part of the last paragraph makes it sound like pg_upgrade won't be
supported across this commit, rather than just between different settings...
I think as a whole this is not an insane idea. A few comments:
- IMO the patch touches many places it shouldn't need to touch, because of
essentially renaming a lot of existing macro names to *Limit,
necessitating modifying a lot of users. I think instead the few places that
care about the runtime limit should be modified.
As-is the patch would cause a lot of fallout in extensions that just do
things like defining an on-stack array of Datums or such - even though all
they'd need is to change the define to the *Limit one.
Even leaving extensions aside, it just makes reviewing (and I'm sure
maintaining) the patch very tedious.
- I'm a bit worried about how the extra special page will be managed - if
there are multiple features that want to use it, who gets to put their data
at what offset?
After writing this I saw that 0002 tries to address this - but I don't like
the design. It introduces runtime overhead that seems likely to be visible.
- Checking for features using PageGetFeatureOffset() seems the wrong design to
me - instead of a branch for some feature being disabled, perfectly
predictable for the CPU, we need to do an external function call every time
to figure out that, yes, checksums are *still* disabled.
- Recomputing offsets every time in PageGetFeatureOffset() seems too
expensive. The offsets can't change while running as PageGetFeatureOffset()
have enough information to distinguish between different kinds of relations
- so why do we need to recompute offsets on every single page? I'd instead
add a distinct offset variable for each feature.
- Modifying every single PageInit() call doesn't make sense to me. That'll
just create a lot of breakage for - as far as I can tell - no win.
- Why is it worth sacrificing space on every page to indicate which features
were enabled? I think there'd need to be some convincing reasons for
introducing such overhead.
- Is it really useful to encode the set of features enabled in a cluster with
a bitmask? That pretty much precludes utilizing extra page space in
extensions. We could instead just have an extra cluster-wide file that
defines a mapping of offset to feature.
Greetings,
Andres Freund
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2023-05-09 17:08:26 -0500, David Christensen wrote:
From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
From: David Christensen <david@pgguru.net>
Date: Tue, 9 May 2023 16:56:15 -0500
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure

This space is reserved for extended data on the Page structure which will be ultimately used for
encrypted data, extended checksums, and potentially other things. This data appears at the end of
the Page, after any `pd_special` area, and will be calculated at runtime based on specific
ControlFile features.

No effort is made to ensure this is backwards-compatible with existing clusters for `pg_upgrade`, as
we will require logical replication to move data into a cluster with
different settings here.

The first part of the last paragraph makes it sound like pg_upgrade won't be
supported across this commit, rather than just between different settings...

I think as a whole this is not an insane idea. A few comments:
Thanks for all the feedback!
- Why is it worth sacrificing space on every page to indicate which features
were enabled? I think there'd need to be some convincing reasons for
introducing such overhead.
In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to interpret the page. For my part, I don't
have a particularly strong feeling on that, but that's what lead to this
design.
Getting a consensus on if that's a requirement or not would definitely
be really helpful.
Thanks,
Stephen
On Wed, Nov 8, 2023 at 8:04 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2023-05-09 17:08:26 -0500, David Christensen wrote:
From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
From: David Christensen <david@pgguru.net>
Date: Tue, 9 May 2023 16:56:15 -0500
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structureThis space is reserved for extended data on the Page structure which
will be ultimately used for
encrypted data, extended checksums, and potentially other things.
This data appears at the end of
the Page, after any `pd_special` area, and will be calculated at
runtime based on specific
ControlFile features.
No effort is made to ensure this is backwards-compatible with existing
clusters for `pg_upgrade`, as
we will require logical replication to move data into a cluster with
different settings here.

The first part of the last paragraph makes it sound like pg_upgrade
won't be
supported across this commit, rather than just between different
settings...
Yeah, that's vague, but you picked up on what I meant.
I think as a whole this is not an insane idea. A few comments:
Thanks for all the feedback!
- Why is it worth sacrificing space on every page to indicate which
features
were enabled? I think there'd need to be some convincing reasons for
introducing such overhead.

In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to interpret the page. For my part, I don't
have a particularly strong feeling on that, but that's what lead to this
design.
Unsurprisingly, I agree that it's useful to keep these features on the
page itself; from a forensic standpoint it seems much easier to
interpret what is happening on a page, and it would also allow you to
have different features on a given page or type of page depending on
need. The initial patch utilizes pg_control to store the cluster page
features, but there's no reason it couldn't be dependent on fork/page
type or stored in pg_tablespace to utilize different features.
Thanks,
David
Greetings,
On Wed, Nov 8, 2023 at 20:55 David Christensen <
david.christensen@crunchydata.com> wrote:
On Wed, Nov 8, 2023 at 8:04 AM Stephen Frost <sfrost@snowman.net> wrote:
* Andres Freund (andres@anarazel.de) wrote:
On 2023-05-09 17:08:26 -0500, David Christensen wrote:
From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
From: David Christensen <david@pgguru.net>
Date: Tue, 9 May 2023 16:56:15 -0500
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure

This space is reserved for extended data on the Page structure which
will be ultimately used for encrypted data, extended checksums, and
potentially other things. This data appears at the end of the Page,
after any `pd_special` area, and will be calculated at runtime based on
specific ControlFile features.
No effort is made to ensure this is backwards-compatible with existing
clusters for `pg_upgrade`, as we will require logical replication to
move data into a cluster with different settings here.

The first part of the last paragraph makes it sound like pg_upgrade
won't be supported across this commit, rather than just between
different settings...
Yeah, that's vague, but you picked up on what I meant.
I think as a whole this is not an insane idea. A few comments:
Thanks for all the feedback!
- Why is it worth sacrificing space on every page to indicate which
features were enabled? I think there'd need to be some convincing
reasons for introducing such overhead.

In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to be able to interpret the page. For my part, I
don't have a particularly strong feeling on that, but that's what led to
this design.

Unsurprisingly, I agree that it's useful to keep these features on the
page itself; from a forensic standpoint it seems much easier to
interpret what is happening on a page, and it would also allow you to
have different features on a given page or type of page depending on
need. The initial patch utilizes pg_control to store the cluster page
features, but there's no reason it couldn't be dependent on fork/page
type or stored in pg_tablespace to utilize different features.
When it comes to authenticated encryption, it’s also the case that it’s
unclear what value the checksum field has, if any… it’s certainly not
directly needed as a checksum, as the auth tag is much better for the
purpose of seeing if the page has been changed in some way. It’s also not
big enough to serve as an auth tag per NIST guidelines regarding the size
of the authenticated data vs. the size of the tag. Using it to indicate
what features are enabled on the page seems pretty useful, as David notes.
Thanks,
Stephen
On Wed, Nov 8, 2023 at 6:04 AM Stephen Frost <sfrost@snowman.net> wrote:
In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to be able to interpret the page. For my part, I
don't have a particularly strong feeling on that, but that's what led to
this design.
The term that I have used in the past is "self-contained". Meaning
capable of being decoded more or less as-is, without any metadata, by
tools like pg_filedump.
Any design in this area should try to make things as easy to debug as
possible, for the obvious reason: encrypted data that somehow becomes
corrupt is bound to be a nightmare to debug. (Besides, we already
support tools like pg_filedump, so this isn't a new principle.)
--
Peter Geoghegan
On Tue, Nov 7, 2023 at 6:20 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-05-09 17:08:26 -0500, David Christensen wrote:
From 965309ea3517fa734c4bc89c144e2031cdf6c0c3 Mon Sep 17 00:00:00 2001
From: David Christensen <david@pgguru.net>
Date: Tue, 9 May 2023 16:56:15 -0500
Subject: [PATCH v4 1/3] Add reserved_page_space to Page structure

This space is reserved for extended data on the Page structure which
will be ultimately used for encrypted data, extended checksums, and
potentially other things. This data appears at the end of the Page,
after any `pd_special` area, and will be calculated at runtime based on
specific ControlFile features.
No effort is made to ensure this is backwards-compatible with existing
clusters for `pg_upgrade`, as we will require logical replication to
move data into a cluster with different settings here.

The first part of the last paragraph makes it sound like pg_upgrade
won't be supported across this commit, rather than just between
different settings...
Thanks for the review.
I think as a whole this is not an insane idea. A few comments:
- IMO the patch touches many places it shouldn't need to touch, because
of essentially renaming a lot of existing macro names to *Limit,
necessitating modifying a lot of users. I think instead the few places
that care about the runtime limit should be modified.

As-is the patch would cause a lot of fallout in extensions that just do
things like defining an on-stack array of Datums or such - even though
all they'd need is to change the define to the *Limit one.

Even leaving extensions aside, it makes reviewing (and I'm sure
maintaining) the patch very tedious.
You make a good point, and I think you're right that we could teach the
places that care about runtime vs compile-time differences about the
changes while leaving other callers alone. The *Limit ones were
introduced since we need constant values here from the Calc...() macros,
but we could try keeping the existing *Limit with the old name and
switching things around. I suspect there will be the same amount of code
churn, but less mechanical.
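To illustrate the shape being discussed (simplified stand-in numbers and
names, not the real PostgreSQL header sizes or macros), the idea is that
the compile-time macro stays as a hard upper bound, safe for stack
arrays, while the runtime value shrinks by whatever the cluster
reserves:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; these names and values are illustrative, not
 * the actual PostgreSQL definitions. */
#define BLCKSZ            8192
#define PAGE_HEADER_SIZE  24
#define LINE_POINTER_SIZE 4
#define MIN_TUPLE_SIZE    32

/*
 * Compile-time limit: the most tuples any page could hold when
 * reserved_page_space is zero.  Safe as a stack-array bound in
 * extensions, since the runtime value can only be smaller.
 */
#define MaxTuplesPerPageLimit \
    ((BLCKSZ - PAGE_HEADER_SIZE) / (LINE_POINTER_SIZE + MIN_TUPLE_SIZE))

/* set once at startup (from pg_control in the patch's design) */
static size_t reserved_page_space = 0;

/* runtime limit, reduced by the trailing space this cluster reserves */
static int
max_tuples_per_page(void)
{
    return (int) ((BLCKSZ - PAGE_HEADER_SIZE - reserved_page_space)
                  / (LINE_POINTER_SIZE + MIN_TUPLE_SIZE));
}
```

Only the few call sites that size real page content would need the
runtime function; everything else keeps using the constant.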
- I'm a bit worried about how the extra special page will be managed -
if there are multiple features that want to use it, who gets to put
their data at what offset?

After writing this I saw that 0002 tries to address this - but I don't
like the design. It introduces runtime overhead that seems likely to be
visible.
Agreed this could be optimized.
- Checking for features using PageGetFeatureOffset() seems the wrong
design to me - instead of a branch for some feature being disabled,
perfectly predictable for the CPU, we need to do an external function
call every time to figure out that yes, checksums are *still* disabled.
This is probably not a supported approach (it felt a little icky), but
I'd played around with const pointers to structs of const elements,
where the initial value of a global var was populated early on (so set
once and never changed post-init), and the compiler didn't complain and
things seemed to work ok; not sure if this approach might help balance
the early mutability and constant lookup needs:

typedef struct PageFeatureOffsets
{
    const Size feature0offset;
    const Size feature1offset;
    ...
} PageFeatureOffsets;

static PageFeatureOffsets offsets = {0};
const PageFeatureOffsets *exposedOffsets = &offsets;

/* called once at startup, before any readers exist; note that writing
 * through a const-stripping cast is what makes this feel unsupported */
void
InitOffsets(void)
{
    *((Size *) &offsets.feature0offset) = ...;
    *((Size *) &offsets.feature1offset) = ...;
    ...
}
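For what it's worth, a self-contained version of that sketch (a
hypothetical two-feature layout with made-up offset values) does compile
and run, though modifying a const-qualified member through a cast is
formally undefined behavior in C, even if common compilers accept it:

```c
#include <assert.h>
#include <stddef.h>

typedef size_t Size;

typedef struct PageFeatureOffsets
{
    const Size feature0offset;
    const Size feature1offset;
} PageFeatureOffsets;

/* zero-initialized at load time; readers only ever see the const view */
static PageFeatureOffsets offsets = {0, 0};
const PageFeatureOffsets *exposedOffsets = &offsets;

/*
 * One-time initialization, run before any readers exist.  Casting const
 * away to store into a const-qualified member is undefined behavior per
 * the C standard - the "icky" part - though it works in practice.
 */
static void
InitOffsets(void)
{
    *((Size *) &offsets.feature0offset) = 8184; /* e.g. BLCKSZ - 8 */
    *((Size *) &offsets.feature1offset) = 8180; /* e.g. BLCKSZ - 12 */
}
```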
- Recomputing offsets every time in PageGetFeatureOffset() seems too
expensive. The offsets can't change while running, as
PageGetFeatureOffset() doesn't have enough information to distinguish
between different kinds of relations
Yes, this was a simple approach for ease of implementation; there is
certainly a way to precompute a lookup table from the page feature bitmask
into the offsets themselves or otherwise precompute, turn from function
call into inline/macro, etc.
- so why do we need to recompute offsets on every single page? I'd
instead
add a distinct offset variable for each feature.
This would work iff there is a single page feature set across all pages in
the cluster; I'm not sure we don't want more flexibility here.
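The precompute idea mentioned above might look something like this
(hypothetical names and feature sizes, assuming a single cluster-wide
feature set): build a per-feature offset array once at startup, so the
hot-path lookup collapses to an array load:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLCKSZ            8192
#define MAX_PAGE_FEATURES 3

/* Illustrative per-feature sizes, e.g. an 8-byte tag for feature 0. */
static const size_t feature_sizes[MAX_PAGE_FEATURES] = {8, 4, 16};

/* 0 means "feature not enabled in this cluster" */
static size_t feature_offsets[MAX_PAGE_FEATURES];
static size_t reserved_page_space;

/* run once at startup from the cluster's feature bitmask */
static void
init_feature_offsets(uint16_t cluster_mask)
{
    size_t off = BLCKSZ;

    for (int f = 0; f < MAX_PAGE_FEATURES; f++)
    {
        if (cluster_mask & (1u << f))
        {
            off -= feature_sizes[f];
            feature_offsets[f] = off;
        }
    }
    reserved_page_space = BLCKSZ - off;
}

/* no recomputation and no external function call on the hot path */
static inline size_t
page_feature_offset(int feature)
{
    return feature_offsets[feature];
}
```

A per-fork or per-page-type variant would just mean one such array per
distinct feature set rather than one global.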
- Modifying every single PageInit() call doesn't make sense to me. That'll
just create a lot of breakage for - as far as I can tell - no win.
This was a placeholder to allow different features depending on page type;
to keep things simple for now I just used the same values here, but we
could move this inside PageInit() instead (again, assuming single feature
set per cluster).
- Why is it worth sacrificing space on every page to indicate which
features
were enabled? I think there'd need to be some convincing reasons for
introducing such overhead.
The point here is if we can use either GCM authtag or stronger checksums
then we've gained the ability to authenticate the page contents at the cost
of reassigning those bits, in a way that would support variable
permutations of features for different relations or page types, if so
desired. A single global setting here both eliminates that possibility as
well as requires external data in order to fully interpret pages.
- Is it really useful to encode the set of features enabled in a cluster
with
a bitmask? That pretty much precludes utilizing extra page space in
extensions. We could instead just have an extra cluster-wide file that
defines a mapping of offset to feature.
Given the current design, yes we do need that, which does make it harder
to allocate/use from an extension. Due to needing to have consistent
offsets for a given feature set (however represented on a page), the
implementation load going forward as-is involves ensuring that a given
bit always maps to the same offset in the page regardless of additional
features available in the future. So the 0'th bit, if enabled, would
always map to the 8-byte chunk at the end of the page, the 1st bit would
correspond to some amount of space prior to that, etc. I'm not sure how
to get that property without some sort of bitmap or otherwise indexed
operation.
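The bit-to-offset rule described above can be written down directly
(illustrative sizes, not from the patch): walk the mask from bit 0
upward, carving each enabled feature's space off the end of the page, so
a feature's offset depends only on lower-numbered bits and defining new
(higher) feature bits later never moves existing features' data:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLCKSZ 8192

/* Illustrative sizes: bit 0 claims the trailing 8-byte chunk, as in the
 * description above; higher bits allocate space just before it. */
static const size_t feature_sizes[] = {8, 4, 16};

/*
 * Offset of an enabled feature under the stable bit->offset mapping:
 * each enabled bit, from 0 upward, takes its space from the current end
 * of the usable page.  Adding future feature bits above 'feature'
 * cannot change the result.
 */
static size_t
feature_offset_for_mask(uint16_t mask, int feature)
{
    size_t off = BLCKSZ;

    for (int f = 0; f <= feature; f++)
        if (mask & (1u << f))
            off -= feature_sizes[f];
    return off;
}
```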
I get what you're saying as far as the more global approach, and while that
does lend itself to some nice properties in terms of extensibility, some of
the features (GCM tags in particular) need to be able to control the page
offset at a consistent location so we can decode the rest of the page
without knowing anything else.
Additionally, since the reserved space/page features are configured at
initdb time I am unclear how a given extension would even be able to stake
a claim here. ...though if we consider this a two-part problem, one of
space reservation and one of space usage, that part could be handled via
allocating more than the minimum in the reserved_page_space and allowing
unallocated page space to be claimed later via some sort of additional
functions/other hook. That opens up other questions though, tracking
whether said space has ever been initialized and what to do when first
accessing existing/new pages as one example.
Best,
David
Hi,
On 2023-11-08 18:47:56 -0800, Peter Geoghegan wrote:
On Wed, Nov 8, 2023 at 6:04 AM Stephen Frost <sfrost@snowman.net> wrote:
In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to be able to interpret the page. For my part, I
don't have a particularly strong feeling on that, but that's what led to
this design.

The term that I have used in the past is "self-contained". Meaning
capable of being decoded more or less as-is, without any metadata, by
tools like pg_filedump.
I'm not finding that very convincing - without cluster wide data, like keys, a
tool like pg_filedump isn't going to be able to do much with encrypted
pages. Given the need to look at some global data, figuring out the offset at
which data starts based on a value in pg_control isn't meaningfully worse than
having the data on each page.
Storing redundant data in each page header, when we've wanted space in
the page header for plenty of other things, just doesn't seem a good use
of said space.
Greetings,
Andres Freund
On Mon, Nov 13, 2023 at 2:27 PM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-11-08 18:47:56 -0800, Peter Geoghegan wrote:
On Wed, Nov 8, 2023 at 6:04 AM Stephen Frost <sfrost@snowman.net> wrote:
In conversations with folks (my memory specifically is a discussion with
Peter G, added to CC, and my apologies to Peter if I'm misremembering)
there was a pretty strong push that a page should be able to 'stand
alone' and not depend on something else (eg: pg_control, or whatever) to
provide the info needed to be able to interpret the page. For my part, I
don't have a particularly strong feeling on that, but that's what led to
this design.

The term that I have used in the past is "self-contained". Meaning
capable of being decoded more or less as-is, without any metadata, by
tools like pg_filedump.

I'm not finding that very convincing - without cluster wide data, like
keys, a tool like pg_filedump isn't going to be able to do much with
encrypted pages. Given the need to look at some global data, figuring
out the offset at which data starts based on a value in pg_control isn't
meaningfully worse than having the data on each page.

Storing redundant data in each page header, when we've wanted space in
the page header for plenty of other things, just doesn't seem a good use
of said space.
This scheme would open up space per page that would now be available for
plenty of other things; the encoding in the header and the corresponding
available space in the footer would seem to open up quite a few options
now, no?