Amcheck verification of GiST and GIN

Started by Andrey Borodin over 3 years ago, 92 messages
#1 Andrey Borodin
x4mmm@yandex-team.ru
1 attachment(s)

Hello world!

A few years ago we had a thread on $subj [0]. A year ago Heikki put a lot of effort into improving the GIN checks [1] while hunting a GIN bug.
In view of some releases that recommend reindexing anything that fails or lacks amcheck verification, I decided to review the thread.

PFA $subj, incorporating all of Heikki's improvements and the restored GiST checks. I've also added heapallindexed verification for GiST. I'm sure we must add it for GIN too, but I don't yet know how to implement it. Maybe just check that every entry generated from the heap is present in the entry tree? Or that every TID is present in the index?
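
For reference, the new functions are invoked like the existing amcheck entry points. A minimal sketch based on the signatures in the attached patch (the index names here are made up):

CREATE EXTENSION amcheck;  -- or ALTER EXTENSION amcheck UPDATE TO '1.4'
-- structural checks only
SELECT gist_index_parent_check('some_gist_idx', false);
-- additionally verify that every heap tuple has a matching index tuple
SELECT gist_index_parent_check('some_gist_idx', true);
SELECT gin_index_parent_check('some_gin_idx', true);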

GiST verification does the parent check despite taking only AccessShareLock. This is possible because, when a key discrepancy is found, we re-fetch the parent tuple with lock coupling. I'm sure that checking keys this way is correct, and I'm almost sure it will not deadlock, because a page split does the same locking.

What do you think?

Best regards, Andrey Borodin.

[0]: /messages/by-id/CAF3eApa07-BajjG8+RYx-Dr_cq28ZA0GsZmUQrGu5b2ayRhB5A@mail.gmail.com
[1]: /messages/by-id/9fdbb584-1e10-6a55-ecc2-9ba8b5dca1cf@iki.fi

Attachments:

v10-0001-Amcheck-for-GIN-and-GiST.patch (application/octet-stream)
From c99eca21503f7b17d374a46289c6c2bca6209685 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 22 Jul 2021 10:08:47 +0300
Subject: [PATCH v10] Amcheck for GIN and GiST

changes since last patch version:
- print a lot more DEBUG3 messages even when nothing's wrong
- don't stop on error, report as warning

Author: Grigory Kryachko, Heikki Linnakangas, Andrey Borodin
Discussion: https://www.postgresql.org/message-id/CAF3eApa07-BajjG8%2BRYx-Dr_cq28ZA0GsZmUQrGu5b2ayRhB5A%40mail.gmail.com
---
 contrib/amcheck/Makefile                |   8 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  24 +
 contrib/amcheck/amcheck.c               | 187 ++++++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  27 +
 contrib/amcheck/expected/check_gin.out  |  60 ++
 contrib/amcheck/expected/check_gist.out |  64 ++
 contrib/amcheck/sql/check_gin.sql       |  40 ++
 contrib/amcheck/sql/check_gist.sql      |  22 +
 contrib/amcheck/verify_gin.c            | 801 ++++++++++++++++++++++++
 contrib/amcheck/verify_gist.c           | 524 ++++++++++++++++
 contrib/amcheck/verify_nbtree.c         | 306 +++------
 doc/src/sgml/amcheck.sgml               |  38 ++
 13 files changed, 1872 insertions(+), 231 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gin.c
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..db8f8112da 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,14 +3,18 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
+	verify_gin.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+	amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gin check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..54180d356d
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,24 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+--
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..7a222719dd
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,187 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should have verified that the relation is a checkable
+ * index (via the checkable callback) before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree and
+	 * GiST never use either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..3e63355143
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,60 @@
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array', true);
+ERROR:  "gin_check_text_array" is not an index
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..6ea04a06b0
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,64 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+DELETE FROM gist_check WHERE c[1] < 1000;
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0;
+VACUUM gist_check;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..4d573686dc
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- minimal test, basically just verifying that amcheck works with GIN
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..11c2f00164
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,22 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+DELETE FROM gist_check WHERE c[1] < 1000;
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0;
+VACUUM gist_check;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..45f3bb47c7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,801 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include <string.h>
+
+/*
+ * GinScanItem represents one item of the depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of the depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check.  Allocates a memory context and scans
+ * through the GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples.  We need to verify that it is not a result of
+					 * a concurrent page split.  So, lock the parent and try
+					 * to find the downlink for the current page.  It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink was re-found, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			/* XXX: why are there invalid item pointers here? We got a segfault without this check. */
+			if (!GinPageIsLeaf(page) && ItemPointerIsValid(&(idxtuple->t_tid)))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..7c06982ab8
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,524 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of the depth-first scan of a GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints the GiST index */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	/* Generate a random seed to avoid repetition */
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	/* Create Bloom filter to fingerprint index */
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	/*
+	 * Register our own snapshot 
+	 */
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for the GiST check.  Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that the tuples of
+ * internal pages cover all the key space of each tuple on a leaf page.
+ *
+ * To do this, every tuple of each internal page is compared against the
+ * tuples on the referenced child page using gistgetadjusted(): a parent
+ * GiST tuple should never require any adjustment to cover the tuples of
+ * its child page.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not a result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If the downlink was re-found, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * The re-fetched parent key covers the child tuple;
+					 * nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
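+					/* Fingerprint this leaf tuple for the heapallindexed heap scan below */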
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
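+	/*
+	 * If the caller requested it, verify that every heap tuple has a
+	 * matching index tuple: scan the heap, re-form an index tuple for each
+	 * heap tuple, and probe the Bloom filter that was populated from the
+	 * leaf index tuples above.
+	 */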
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
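+		/*
+		 * Clear uniqueness and exclusion constraint info, so that the scan
+		 * only reads the heap and never waits on concurrent transactions
+		 * the way a real constraint-enforcing index build would.
+		 */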
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
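+
+	/*
+	 * A Bloom filter probe can return a false positive, so this check may
+	 * miss some corruption, but it never reports a failure spuriously.
+	 */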
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
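+/*
+ * Basic sanity checks that apply to any GiST page: the generic page header
+ * checks done by gistcheckpage(), the GiST page id, and a couple of
+ * invariants about deleted pages and tuple counts.
+ */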
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with tuple count exceeding the maximum",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index a8791000f8..d12c55b478 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -40,6 +40,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -137,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -183,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -202,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -225,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -241,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -397,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -792,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -874,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1092,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1128,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1441,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1734,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1744,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1864,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1968,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2019,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2106,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2338,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2362,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2411,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2781,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2904,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3142,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,44 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

#2Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Andrey Borodin (#1)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

On 30 May 2022, at 12:40, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

What do you think?

Hi Andrey!

Here's a version with better tests. I've made sure that the GiST tests actually trigger page reuse after deletion, and enhanced the comments in both the GiST and GIN test scripts. I hope you'll like it.

GIN heapallindexed is still a no-op check. Looking forward to hearing any ideas on what it could be.

Best regards, Andrey Borodin.

Attachments:

v11-0001-Amcheck-for-GIN-and-GiST.patchapplication/octet-stream; name=v11-0001-Amcheck-for-GIN-and-GiST.patch; x-unix-mode=0644Download
From 35922b26ae10b131216746fed1890a2e255b811d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu, 22 Jul 2021 10:08:47 +0300
Subject: [PATCH v11] Amcheck for GIN and GiST

Author: Grigory Kryachko, Heikki, Andrey
Discussion: https://www.postgresql.org/message-id/CAF3eApa07-BajjG8%2BRYx-Dr_cq28ZA0GsZmUQrGu5b2ayRhB5A%40mail.gmail.com
---
 contrib/amcheck/Makefile                |   8 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  24 +
 contrib/amcheck/amcheck.c               | 187 ++++++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  27 +
 contrib/amcheck/expected/check_gin.out  |  60 ++
 contrib/amcheck/expected/check_gist.out | 119 ++++
 contrib/amcheck/sql/check_gin.sql       |  40 ++
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gin.c            | 801 ++++++++++++++++++++++++
 contrib/amcheck/verify_gist.c           | 524 ++++++++++++++++
 contrib/amcheck/verify_nbtree.c         | 306 +++------
 doc/src/sgml/amcheck.sgml               |  38 ++
 13 files changed, 1947 insertions(+), 231 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gin.c
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..db8f8112da 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,14 +3,18 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
+	verify_gin.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+	amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gin check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..54180d356d
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,24 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+--
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..7a222719dd
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,187 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
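+/*
+ * Lock the heap relation and then the index, switch to the table owner's
+ * userid, verify with the 'checkable' callback that the index is of the
+ * expected access method, and finally run the access-method-specific
+ * 'check' callback.  This keeps the locking, security and cleanup logic
+ * common to all amcheck entry points in one place.
+ */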
+void amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if the requested lockmode
+	 * is stronger than AccessShareLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if the requested lockmode is
+	 * stronger than AccessShareLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..c4c0cfd94d
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,60 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..568a67ebe2
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..45f3bb47c7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,801 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of a depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Allocates a memory context and performs a depth-first scan of the
+ * posting tree rooted at 'posting_tree_root', checking parent-key
+ * consistency and graph invariants along the way.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
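+			/*
+			 * Decode all TIDs stored on this leaf page so that they can be
+			 * compared against the parent's high key below.
+			 */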
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling when
+		 * we scanned the parent.  If so, add the right sibling to the stack
+		 * now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* XXX: order check is skipped on the entry tree root (block 1) */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples.  We need to verify that it is not the result
+					 * of a concurrent page split.  So, lock the parent and
+					 * try to re-find the downlink for the current page.  It
+					 * may be missing due to a concurrent page split; this
+					 * is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found it, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * The re-found parent key does cover this tuple
+							 * after all - nothing to do here.
+							 */
+						}
+					}
+				}
+			}
+
+			/*
+			 * If this is an internal page, recurse into the child.
+			 *
+			 * XXX: why do we sometimes see an invalid item pointer here?
+			 * Without this check we got a segfault.
+			 */
+			if (!GinPageIsLeaf(page) && ItemPointerIsValid(&(idxtuple->t_tid)))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with an excessive number of tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
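+		/* downlinks on internal entry pages store the child block in t_tid */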
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..7c06982ab8
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,524 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of a depth-first scan of a GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprinting the GiST index's leaf tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	* Size Bloom filter based on estimated number of tuples in index
+	*/
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	/* Generate a random seed to avoid repetition */
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	/* Create Bloom filter to fingerprint index */
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	/*
+	 * Register our own snapshot 
+	 */
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates a memory context and scans
+ * through the GiST graph.  For every tuple on an internal page we verify
+ * that it covers the key space of the child page it points to, i.e.
+ * gistgetadjusted() must never report that the parent tuple requires
+ * adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling when
+		 * we scanned the parent.  If so, add the right sibling to the stack
+		 * now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.
+			 * See also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a
+				 * concurrent call of gistplacetopage().  So, lock the parent
+				 * and try to re-find the downlink for the current page.  It
+				 * may be missing due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If we re-found it, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
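+					/* fingerprint the whole leaf tuple, including its heap TID */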
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with an excessive number of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index a8791000f8..d12c55b478 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -40,6 +40,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -137,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void *state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -183,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -202,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -225,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -241,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void
+bt_index_check_internal_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -397,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -792,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -874,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1092,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1128,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1441,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1734,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1744,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1864,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1968,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2019,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2106,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2338,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2362,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2411,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2781,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2904,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3142,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,44 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (parent keys correctly
+      bound the keys of their child pages) and that the page graph respects
+      balanced-tree invariants (internal pages reference either only leaf
+      pages or only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).  If <parameter>heapallindexed</parameter> is true, it
+      also verifies that every heap tuple has a matching index tuple.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

#3Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Andrey Borodin (#2)
Re: Amcheck verification of GiST and GIN

On Wed, Jun 22, 2022 at 11:35 AM Andrey Borodin <x4mmm@yandex-team.ru>
wrote:

On 30 May 2022, at 12:40, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

What do you think?

Hi Andrey!

Hi Andrey!

Since you're talking to yourself, just wanted to support you – this is an
important thing, definitely should be very useful for many projects; I hope
to find time to test it in the next few days.

Thanks for working on it.

#4Andres Freund
andres@anarazel.de
In reply to: Andrey Borodin (#2)
Re: Amcheck verification of GiST and GIN

Hi,

I think having amcheck for more indexes is great.

On 2022-06-22 20:40:56 +0300, Andrey Borodin wrote:

diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..7a222719dd
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,187 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.

This'd likely be easier to read if the reorganization were split into its own
commit.

I'd also split gin / gist support. It's a large enough patch that that imo
makes reviewing easier.

+void amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)

Might be worth pgindenting - the void for function definitions (but not for
declarations) is typically on its own line in PG code.
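
I.e. just reflowing the definition (this is purely a formatting sketch of the
signature quoted above, not a functional change):

void
amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
								IndexDoCheckCallback check, LOCKMODE lockmode,
								void *state)
{
	/* body unchanged */
}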

+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	* Size Bloom filter based on estimated number of tuples in index
+	*/
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	/* Generate a random seed to avoid repetition */
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	/* Create Bloom filter to fingerprint index */
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	/*
+	 * Register our own snapshot
+	 */
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());

FWIW, comments like this, that just restate exactly what the code does, are
imo not helpful. Also, there's a trailing space :)

Greetings,

Andres Freund
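
One practical consequence of the sizing logic quoted above: the Bloom filter used for
heapallindexed is capped by maintenance_work_mem, so giving the session more memory when
checking a large index makes it less likely that a missing index tuple slips through the
filter undetected. A minimal sketch, with an illustrative index name:

SET maintenance_work_mem = '1GB';
SELECT gist_index_parent_check('gist_check_idx1', true);
RESET maintenance_work_mem;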

#5Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Nikolay Samokhvalov (#3)
Re: Amcheck verification of GiST and GIN

On 23 Jun 2022, at 00:27, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:

Since you're talking to yourself, just wanted to support you – this is an important thing, definitely should be very useful for many projects; I hope to find time to test it in the next few days.

Thanks Nikolay!

On 23 Jun 2022, at 04:29, Andres Freund <andres@anarazel.de> wrote:

Thanks for looking into the patch, Andres!

On 2022-06-22 20:40:56 +0300, Andrey Borodin wrote:

diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..7a222719dd
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,187 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.

This'd likely be easier to read if the reorganization were split into its own
commit.

I'd also split gin / gist support. It's a large enough patch that that imo
makes reviewing easier.

I will split the patch into 3 steps:
1. extract generic functions to amcheck.c
2. add gist functions
3. add gin functions
But each of these steps just adds a few independent files plus some lines to the Makefile.

I'll fix the other notes too in the next version.

Thanks!

Best regards, Andrey Borodin.

#6Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Andrey Borodin (#5)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On 26 Jun 2022, at 00:10, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I will split the patch into 3 steps:
1. extract generic functions to amcheck.c
2. add gist functions
3. add gin functions

I'll fix the other notes too in the next version.

Done. PFA attached patchset.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v12-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v12-0001-Refactor-amcheck-to-extract-common-locking-routi.patch; x-unix-mode=0644Download
From 2a0b582407d2c180e31ca278a61d0a128d64726f Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v12 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 ++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/verify_nbtree.c | 306 ++++++++------------------------
 4 files changed, 295 insertions(+), 228 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..0194ef0d7a
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index a8791000f8..d12c55b478 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -40,6 +40,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -137,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -183,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -202,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -225,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -241,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -397,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -792,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -874,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1092,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1128,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1441,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1734,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1744,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1864,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1968,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2019,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2106,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2338,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2362,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2411,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2781,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2904,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3142,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)

v12-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v12-0002-Add-gist_index_parent_check-function-to-verify-G.patch; x-unix-mode=0644Download
From 25f4940f334d51109e881abcfd439ab627de0685 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v12 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 520 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 7 files changed, 719 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..db65880d87
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,520 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints the GiST index */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to B-tree, see verify_nbtree.c.
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * the GiST graph.  This function verifies that the tuples of internal pages
+ * cover all the key space of the tuples on the leaf pages below them.  To do
+ * this we follow every downlink in the tree.
+ *
+ * For each tuple on a child page we check that it is consistent with the
+ * downlink tuple in the parent.  A parent GiST tuple should never require
+ * any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between the parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; that is OK.
+				 *
+				 * Note that when we re-acquire the parent tuple we hold locks
+				 * on both the parent and child buffers.  Thus the parent tuple
+				 * must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* Check the re-found downlink, if any, before reporting corruption */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 * The re-found parent tuple needs no adjustment, so nothing to do here.
+				}
+			}
+
+			
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relationships (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

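For readers who want to smoke-test the GiST patch locally, a minimal session could look like the one below. This is only a sketch: it assumes the patched amcheck is installed and upgraded to the 1.4 version introduced by this patchset, and the table and index names (gist_check, gist_check_idx) are invented for illustration.

CREATE EXTENSION IF NOT EXISTS amcheck;
ALTER EXTENSION amcheck UPDATE TO '1.4';

CREATE TABLE gist_check AS
    SELECT point(random(), random()) AS p
    FROM generate_series(1, 10000);
CREATE INDEX gist_check_idx ON gist_check USING gist (p);

-- structural (parent-child and graph) checks only
SELECT gist_index_parent_check('gist_check_idx', false);
-- additionally verify that every heap tuple has a matching index tuple
SELECT gist_index_parent_check('gist_check_idx', true);
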
v12-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v12-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch; x-unix-mode=0644Download
From b928670494a831f3925705a9f7cc0a60e914fb57 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v12 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  60 ++
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 801 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 6 files changed, 932 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..c4c0cfd94d
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,60 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array', true);
+ERROR:  "gin_check_text_array" is not an index
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..568a67ebe2
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..45f3bb47c7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,801 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of the depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of the depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph,
+ * checking key consistency and tree invariants along the way.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of a
+					 * concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is still there, re-check the key against it */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			/* XXX: why do we see invalid pointers here? Got a segfault without this check. */
+			if (!GinPageIsLeaf(page) && ItemPointerIsValid(&(idxtuple->t_tid)))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with tuple count exceeding the maximum",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find the downlink pointing to 'childblkno' in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN
+      index has consistent parent-child tuple relationships (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)

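Similarly, a quick way to exercise the new GIN check against every GIN index in a database might be the query below. Again just a sketch, under the assumption that the patched amcheck 1.4 is installed and that the calling role has EXECUTE on the function (the upgrade script revokes it from PUBLIC):

SELECT c.relname AS index_name,
       gin_index_parent_check(c.oid, true)
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i'
  AND am.amname = 'gin';
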
#7Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Andrey Borodin (#6)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On 23 Jul 2022, at 14:40, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Done. PFA attached patchset.

Best regards, Andrey Borodin.
<v12-0001-Refactor-amcheck-to-extract-common-locking-routi.patch><v12-0002-Add-gist_index_parent_check-function-to-verify-G.patch><v12-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch>

Here's v13. Changes:
1. Fixed passing through downlink in GIN index
2. Fixed GIN tests (one test case was not working)

Thanks to Vitaliy Kukharik for trying these patches.

Best regards, Andrey Borodin.

Attachments:

v13-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v13-0001-Refactor-amcheck-to-extract-common-locking-routi.patch; x-unix-mode=0644Download
From b5bfc240b5cebcaec7b3ccd213710188b430d7b5 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v13 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 ++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/verify_nbtree.c | 306 ++++++++------------------------
 4 files changed, 295 insertions(+), 228 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..0194ef0d7a
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should have verified that the relation is a suitable index
+ * (via the checkable callback) before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error for too-strong lock modes.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 * standby mode this will raise an error for too-strong lock modes.
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree and gist
+	 * never use either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 2beeebb163..d12c55b478 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -40,6 +40,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -137,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -183,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -202,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -225,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -241,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -397,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -792,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -874,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1092,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1128,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1441,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1734,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1744,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1864,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1968,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2019,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2106,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2338,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2362,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2411,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2781,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2904,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3142,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)

v13-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v13-0002-Add-gist_index_parent_check-function-to-verify-G.patch; x-unix-mode=0644Download
From 9eb322531c9ec74cb6e8ef0910d54f46f2f82f08 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v13 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 520 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 7 files changed, 719 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..db65880d87
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,520 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include the keys
+ * of tuples on their child pages.  Verification also checks graph
+ * invariants: an internal page must have at least one downlink, and it
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints the GiST index */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to B-tree verification, see
+	 * verify_nbtree.c.
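+	 *
+	 * For a rough sense of scale (an illustrative assumption, not part of
+	 * the patch itself): with the default 8 kB block size MaxOffsetNumber
+	 * is 2048, so each index page is assumed to contribute about 409
+	 * tuples unless reltuples suggests a larger estimate.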
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that tuples on internal
+ * pages cover all the key space of the tuples on the leaf pages beneath
+ * them.  To do this we walk the tree depth-first and, for every downlink,
+ * try to adjust it with the tuples on the referenced child page.  A parent
+ * GiST tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we re-acquire the parent tuple we hold locks
+				 * on both the parent and the child buffer, so the parent
+				 * tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If the downlink is still there, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      has consistent parent-child tuples relations (no parent tuples
+      require tuple adjustement) and page graph respects balanced-tree
+      invariants (internal pages reference only leaf page or only internal
+      pages).
+     </para>
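+     <para>
+      A minimal usage sketch (the index name is only an example, borrowed
+      from the regression test added by this patch):
+<programlisting>
+SELECT gist_index_parent_check('gist_check_idx1', true);
+</programlisting>
+     </para>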
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

v13-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v13-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch; x-unix-mode=0644Download
From e45cc422aea129d35da8b521580379cf78ad430b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v13 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 800 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 936 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 0194ef0d7a..1e0c875926 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..d98d525c66
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..789259e662
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..90fe89501d
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,800 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include the keys
+ * of tuples on their child pages.  Verification also checks graph
+ * invariants: an internal page must have at least one downlink, and it
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of a GIN
+ * posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
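+			 *
+			 * In other words, for the layout this check assumes, pd_lower
+			 * should equal MAXALIGN(SizeOfPageHeaderData) +
+			 * MAXALIGN(sizeof(ItemPointerData)) + maxoff * sizeof(PostingItem).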
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples.  We need to verify that it is not the result
+					 * of a concurrent page split.  So, lock the parent and
+					 * try to find the downlink for the current page.  It may
+					 * be missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is gone, assume a concurrent split; otherwise make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with tuple count exceeding the maximum",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)

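As a quick usage reference for the patch above, a minimal sketch (the table, column, and index names are hypothetical; only the two-argument signatures declared in amcheck--1.3--1.4.sql are assumed, and the exact behaviour of the heapallindexed argument may still change between patch versions):

CREATE EXTENSION IF NOT EXISTS amcheck;

-- hypothetical test objects
CREATE TABLE example_pts (c point, tags int[]);
CREATE INDEX example_pts_gist ON example_pts USING gist (c);
CREATE INDEX example_tags_gin ON example_pts USING gin (tags);

-- GiST: structural check, then the same plus heapallindexed
SELECT gist_index_parent_check('example_pts_gist', false);
SELECT gist_index_parent_check('example_pts_gist', true);

-- GIN: parent-key consistency check
SELECT gin_index_parent_check('example_tags_gin', false);
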
#8Andres Freund
andres@anarazel.de
In reply to: Andrey Borodin (#7)
Re: Amcheck verification of GiST and GIN

Hi,

On 2022-08-17 17:28:02 +0500, Andrey Borodin wrote:

Here's v13. Changes:
1. Fixed passing through downlink in GIN index
2. Fixed GIN tests (one test case was not working)

Thanks to Vitaliy Kukharik for trying these patches.

Due to the merge of the meson based build, this patch needs to be
adjusted. See
https://cirrus-ci.com/build/6637154947301376

The changes should be fairly simple, just mirroring the Makefile ones.

Greetings,

Andres Freund

#9Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#8)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

Hi,

On 2022-09-22 08:19:09 -0700, Andres Freund wrote:

Hi,

On 2022-08-17 17:28:02 +0500, Andrey Borodin wrote:

Here's v13. Changes:
1. Fixed passing through downlink in GIN index
2. Fixed GIN tests (one test case was not working)

Thanks to Vitaliy Kukharik for trying these patches.

Due to the merge of the meson based build, this patch needs to be
adjusted. See
https://cirrus-ci.com/build/6637154947301376

The changes should be fairly simple, just mirroring the Makefile ones.

Here's an updated patch adding meson compat.

I didn't fix the following warnings:

[25/28 3 89%] Compiling C object contrib/amcheck/amcheck.dll.p/amcheck.c.obj
../../home/andres/src/postgresql/contrib/amcheck/amcheck.c: In function ‘amcheck_lock_relation_and_check’:
../../home/andres/src/postgresql/contrib/amcheck/amcheck.c:81:20: warning: implicit declaration of function ‘NewGUCNestLevel’ [-Wimplicit-function-declaration]
81 | save_nestlevel = NewGUCNestLevel();
| ^~~~~~~~~~~~~~~
../../home/andres/src/postgresql/contrib/amcheck/amcheck.c:124:2: warning: implicit declaration of function ‘AtEOXact_GUC’; did you mean ‘AtEOXact_SMgr’? [-Wimplicit-function-declaration]
124 | AtEOXact_GUC(false, save_nestlevel);
| ^~~~~~~~~~~~
| AtEOXact_SMgr
[26/28 2 92%] Compiling C object contrib/amcheck/amcheck.dll.p/verify_gin.c.obj
../../home/andres/src/postgresql/contrib/amcheck/verify_gin.c: In function ‘gin_check_parent_keys_consistency’:
../../home/andres/src/postgresql/contrib/amcheck/verify_gin.c:423:8: warning: unused variable ‘heapallindexed’ [-Wunused-variable]
423 | bool heapallindexed = *((bool*)callback_state);
| ^~~~~~~~~~~~~~
[28/28 1 100%] Linking target contrib/amcheck/amcheck.dll

Greetings,

Andres Freund

Attachments:

v14-0002-Add-gist_index_parent_check-function-to-verify-G.patchtext/x-diff; charset=us-asciiDownload
From 5c6709a38561adf4cf2d6f72ad5fcd86fdf4bbde Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v14 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 520 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 722 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d5..a8174195817 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 00000000000..93297379efe
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f754..e67ace01c99 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..9749adfd340
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 227e68ff834..d84138f7fa6 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck = shared_module('amcheck', [
     'amcheck.c',
+    'verify_gist.c',
     'verify_heapam.c',
     'verify_nbtree.c',
   ],
@@ -13,6 +14,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -25,6 +27,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..75b9ff4b436
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..db65880d871
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,520 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include the
+ * tuples of their child pages.  It also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints the GiST index */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to the B-tree case, see verify_nbtree.c.
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that the tuples on internal
+ * pages cover all the key space of the tuples on the pages below them.  To do
+ * this the scan walks the tree depth-first.
+ *
+ * For every tuple we check whether the parent tuple would need to be adjusted
+ * to cover it (i.e. gistgetadjusted() returns non-NULL).  A parent GiST tuple
+ * should never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split after we looked at the
+		 * parent, in which case we may have missed the downlink of the
+		 * right sibling when we scanned the parent.  If so, add the right
+		 * sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If the downlink is gone, assume a concurrent split; otherwise make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with tuple count exceeding the maximum",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936f..9397a69c6e5 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.37.3.542.gdd3f6c4cae

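A usage note on v14-0002: since the upgrade script revokes execution from PUBLIC, the check has to run as a sufficiently privileged role. Below is a sketch of driving it over every GiST index in the current database, assuming only the gist_index_parent_check(regclass, boolean) signature added above; the catalog query itself is illustrative:

SELECT c.relname,
       gist_index_parent_check(c.oid::regclass, true)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am a ON a.oid = c.relam
WHERE a.amname = 'gist'
  AND i.indisvalid                 -- the function rejects invalid indexes
  AND c.relpersistence <> 't';     -- skip temporary indexes; other sessions' temp relations are rejected
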
v14-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchtext/x-diff; charset=us-asciiDownload
From 5af85c93cb961040c5e256f79db38d285c5f1fb7 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v14 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 800 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 938 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a8174195817..ecb849a605b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379efe..c914e6d0baa 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 0194ef0d7a2..1e0c875926c 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..d98d525c660
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index d84138f7fa6..0d7d651b24a 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck = shared_module('amcheck', [
     'amcheck.c',
+    'verify_gin.c',
     'verify_gist.c',
     'verify_heapam.c',
     'verify_nbtree.c',
@@ -28,6 +29,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..789259e6629
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..90fe89501d8
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,800 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include the
+ * tuples of their child pages.  It also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal page traversal unexpectedly encountered leaf page at block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of a
+					 * concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Re-check against the re-found parent tuple, if any */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u in parent block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e5..7ffa36b2057 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
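+     <para>
+      For example, assuming a GIN index named
+      <literal>some_gin_index</literal> exists (the name is purely
+      illustrative), the check could be invoked as:
+     </para>
+<programlisting>
+SELECT gin_index_parent_check('some_gin_index', false);
+</programlisting>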
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.37.3.542.gdd3f6c4cae

v14-0001-Refactor-amcheck-to-extract-common-locking-routi.patchtext/x-diff; charset=us-asciiDownload
From 9c2919df1419216e596819e9a3e23616d431d8d0 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v14 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 308 ++++++++------------------------
 5 files changed, 297 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50b..f10fd9d89d5 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 00000000000..0194ef0d7a2
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should have verified that the relation is a suitable index
+ * (via the AM-specific "checkable" callback) before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if the requested lock
+	 * mode is not allowed during recovery (e.g., ShareLock).
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if the requested lock mode is
+	 * not allowed during recovery.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since the
+	 * index AMs verified here never use either.  Verify that line pointer
+	 * has storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 00000000000..10906efd8a5
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1db3d20349e..227e68ff834 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,4 +1,5 @@
 amcheck = shared_module('amcheck', [
+    'amcheck.c',
     'verify_heapam.c',
     'verify_nbtree.c',
   ],
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9021d156eb7..950014f19d5 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -41,6 +41,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +140,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +184,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +208,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +236,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +255,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
-		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
 					allequalimage;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
+						RelationGetRelationName(indrel))));
 
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +320,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +692,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +774,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +992,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1028,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1341,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1634,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1644,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1764,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1868,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1919,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2006,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2238,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2262,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2311,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2681,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2804,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3042,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.37.3.542.gdd3f6c4cae

#10Andrew Borodin
amborodin86@gmail.com
In reply to: Andres Freund (#9)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sun, Oct 2, 2022 at 12:12 AM Andres Freund <andres@anarazel.de> wrote:

Here's an updated patch adding meson compat.

Thank you, Andres! Here's one more rebase (something was adjusted in
the amcheck build).
Also I've fixed the new warnings, except the warning about the absent
heapallindexed support for GIN. It's still a TODO.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v15-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v15-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From 215a18b266ade415edbc014a36ecdceb20168b1b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v15 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 520 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 722 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 29d100120e..66e34d8706 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -22,6 +23,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -34,6 +36,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..db65880d87
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,520 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprinting the GiST index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+	
+	return result;
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and does a
+ * depth-first scan of the GiST graph.  For every internal tuple we verify
+ * that it covers the whole key space of the child page it references: the
+ * downlink tuple is "adjusted" by each tuple on the child page, and a parent
+ * GiST tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+	
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between the parent and child
+				 * tuples.  We need to verify that it is not the result of a
+				 * concurrent call of gistplacetopage().  So, lock the parent
+				 * and try to find the downlink for the current page.  It may
+				 * be missing due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the key space of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If we re-found the downlink, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * The re-found parent tuple covers the child's key
+					 * space, so there is nothing more to do here.
+					 */
+				}
+			}
+
+			
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
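+/*
+ * table_index_build_scan() callback for the heapallindexed check: form the
+ * leaf index tuple for this heap tuple and probe the Bloom filter that was
+ * populated while walking the index.
+ */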
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
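+
+	/*
+	 * The heap TID is part of the fingerprinted leaf tuple, so set it here to
+	 * reproduce the exact bytes that were added to the Bloom filter during
+	 * the index scan.
+	 */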
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
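+/*
+ * Basic sanity checks on a GiST page: page header and special area via
+ * gistcheckpage(), plus deleted-page invariants and an upper bound on the
+ * number of tuples.
+ */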
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relationships (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references either only leaf pages or only
+      internal pages).
+     </para>
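+     <para>
+      For example, assuming a GiST index named
+      <literal>test_gist_idx</literal> exists, one might run:
+<programlisting>
+SELECT gist_index_parent_check('test_gist_idx', true);
+</programlisting>
+      Passing <literal>true</literal> as the second argument additionally
+      verifies that every heap tuple has a matching index tuple.
+     </para>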
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.37.0 (Apple Git-136)

v15-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v15-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From a77ab4d265b9c19e14af2532a67191ec32746d1c Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v15 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 800 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 938 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..d98d525c66
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 66e34d8706..f3f097c5c5 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..789259e662
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..90fe89501d
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,800 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages include the keys of tuples
+ * on child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and does a depth-first scan of a GIN posting
+ * tree, checking key ordering and consistency with parent downlinks.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u)",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between the parent and child
+					 * tuples.  We need to verify that it is not the result
+					 * of a concurrent page split.  So, lock the parent and
+					 * try to find the downlink for the current page.  It may
+					 * be missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found the downlink, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * The re-found parent key covers the child, so
+							 * there is nothing more to do here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
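+/*
+ * Basic sanity checks on a GIN page: page header and special area, plus
+ * deleted-page invariants and an upper bound on the number of tuples.
+ * Returns false (after emitting a WARNING) if the page should be skipped.
+ */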
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN
+      index has consistent parent-child tuple relationships (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references either only leaf pages or only
+      internal pages).
+     </para>
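+     <para>
+      For example, assuming a GIN index named
+      <literal>test_gin_idx</literal> exists, one might run:
+<programlisting>
+SELECT gin_index_parent_check('test_gin_idx', true);
+</programlisting>
+     </para>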
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.37.0 (Apple Git-136)

v15-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v15-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From fbf14a532dd21fa48521cb8978a05b6dc21c9c4f Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v15 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 ++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 306 ++++++++------------------------
 5 files changed, 296 insertions(+), 228 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: The checkable callback should have validated the relation before
+ * this is called.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if the requested lock
+	 * mode is too strong (e.g., the ShareLock used for parent checks).
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if the requested lock mode is
+	 * too strong (e.g., the ShareLock used for parent checks).
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check() */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
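+/*
+ * Lock the index (and its heap relation, if any) in the given lockmode, run
+ * the 'checkable' callback to validate the relation, then run 'check' with
+ * 'state' passed through, and finally release the locks.
+ */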
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 2194a91124..29d100120e 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,4 +1,5 @@
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9021d156eb..950014f19d 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -41,6 +41,8 @@
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +140,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +184,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +208,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +236,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +255,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +320,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +692,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +774,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +992,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1028,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1341,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1634,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1644,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1764,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1868,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1919,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2006,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2238,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2262,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2311,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2681,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2804,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3042,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.37.0 (Apple Git-136)

#11Jose Arthur Benetasso Villanova
jose.arthur@gmail.com
In reply to: Andrew Borodin (#10)
Re: Amcheck verification of GiST and GIN

Hello.

I reviewed this patch and I would like to share some comments.

It compiled with those 2 warnings:

verify_gin.c: In function 'gin_check_parent_keys_consistency':
verify_gin.c:481:38: warning: declaration of 'maxoff' shadows a previous local [-Wshadow=compatible-local]
  481 |   OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
      |                ^~~~~~
verify_gin.c:453:41: note: shadowed declaration is here
  453 |   maxoff;
      |   ^~~~~~
verify_gin.c:423:25: warning: unused variable 'heapallindexed' [-Wunused-variable]
  423 |   bool heapallindexed = *((bool*)callback_state);
      |        ^~~~~~~~~~~~~~

Also, I'm not sure about Postgres' header conventions: amcheck.h includes
"miscadmin.h", while verify_gin.c, verify_gist.c and verify_nbtree.c include
both amcheck.h and miscadmin.h.

About the documentation: bt_index_parent_check has notes about ShareLock and
"SET client_min_messages = DEBUG1;", but gist_index_parent_check and
gin_index_parent_check lack them. verify_gin uses DEBUG3; I'm not sure
whether that is on purpose, but it would be nice either to document it or
to use DEBUG1 for consistency.

I lack the context to do a deep review of the code, so this patch needs
more eyes in that area.

I did the following test:

postgres=# create table teste (t text, tv tsvector);
CREATE TABLE
postgres=# insert into teste values ('hello', 'hello'::tsvector);
INSERT 0 1
postgres=# create index teste_tv on teste using gist(tv);
CREATE INDEX
postgres=# select pg_relation_filepath('teste_tv');
pg_relation_filepath
----------------------
base/5/16441
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l log stop
waiting for server to shut down.... done
server stopped
$ okteta base/5/16441 # I couldn't figure out the dd syntax to change the
byte at 0x1FE9 to 0; a Perl alternative is sketched after this session
$ bin/pg_ctl -D data -l log start
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# select gist_index_parent_check('teste_tv'::regclass, true);
DEBUG: verifying that tuples from index "teste_tv" are present in "teste"
ERROR: heap tuple (0,1) from table "teste" lacks matching index tuple within index "teste_tv"
postgres=#
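
For the dd aside above: the same one-byte edit can be scripted in a few
lines of Perl. I read "change the 1FE9 to 0" as zeroing the byte at offset
0x1FE9; the path is the one from this session, and the script must run from
the data directory while the server is stopped.

# zero_byte.pl -- same edit as the okteta session above: write a zero byte
# at offset 0x1FE9 of the index file base/5/16441.
use strict;
use warnings;

open my $fh, '+<', 'base/5/16441' or die "open: $!";
binmode $fh;
sysseek($fh, 0x1FE9, 0) or die "seek: $!";
syswrite($fh, "\x00", 1) or die "write: $!";
close $fh;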

A simple index corruption in GIN:

postgres=# CREATE TABLE "gin_check"("Column1" int[]);
CREATE TABLE
postgres=# insert into gin_check values (ARRAY[1]),(ARRAY[2]);
INSERT 0 2
postgres=# CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
CREATE INDEX
postgres=# select pg_relation_filepath('gin_check_idx');
pg_relation_filepath
----------------------
base/5/16453
(1 row)

postgres=#
\q
$ bin/pg_ctl -D data -l logfile stop
waiting for server to shut down.... done
server stopped
$ okteta data/base/5/16453 # edited some bits near 3FCC
$ bin/pg_ctl -D data -l logfile start
waiting for server to start.... done
server started
$ bin/psql -U ze postgres
psql (16devel)
Type "help" for help.

postgres=# SET client_min_messages = DEBUG3;
SET
postgres=# SELECT gin_index_parent_check('gin_check_idx', true);
ERROR: number of items mismatch in GIN entry tuple, 49 in tuple header, 1 decoded
postgres=#

There are more code paths to exercise in order to cover the entire code, and
I had a hard time corrupting the indexes. Is there any automated way to
corrupt an index to test such code?

--
Jose Arthur Benetasso Villanova

#12Andrey Borodin
amborodin86@gmail.com
In reply to: Jose Arthur Benetasso Villanova (#11)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

Hello!

Thank you for the review!

On Thu, Nov 24, 2022 at 6:04 PM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

It compiled with those 2 warnings:

verify_gin.c: In function 'gin_check_parent_keys_consistency':
verify_gin.c:481:38: warning: declaration of 'maxoff' shadows a previous local [-Wshadow=compatible-local]
  481 |   OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
      |                ^~~~~~
verify_gin.c:453:41: note: shadowed declaration is here
  453 |   maxoff;
      |   ^~~~~~
verify_gin.c:423:25: warning: unused variable 'heapallindexed' [-Wunused-variable]

Fixed.

  423 |   bool heapallindexed = *((bool*)callback_state);
      |        ^~~~~~~~~~~~~~

This one is still in progress; the heapallindexed check is not implemented yet...

Also, I'm not sure about Postgres' header conventions: amcheck.h includes
"miscadmin.h", while verify_gin.c, verify_gist.c and verify_nbtree.c include
both amcheck.h and miscadmin.h.

Fixed.

About the documentation: bt_index_parent_check has notes about ShareLock and
"SET client_min_messages = DEBUG1;", but gist_index_parent_check and
gin_index_parent_check lack them. verify_gin uses DEBUG3; I'm not sure
whether that is on purpose, but it would be nice either to document it or
to use DEBUG1 for consistency.

GiST and GIN verification do not take ShareLock for parent checks.
The B-tree check cannot verify its cross-level invariants while the index
is changing.

GiST verification checks only one such invariant, and it can be verified
as long as page locks are acquired the same way a page split acquires them.
GIN does not require ShareLock because it does not check cross-level invariants.

Reporting progress with DEBUG1 is a good idea; I did not know that feature
existed. I'll implement something similar in the following versions.

I did the following test:

Cool! Thank you!

There are more code paths to exercise in order to cover the entire code, and
I had a hard time corrupting the indexes. Is there any automated way to
corrupt an index to test such code?

The heapam tests do this in an automated way; look at
t/001_verify_heapam.pl.
Surely we can write such tests, at least to automate what you have just
done in the review. However, committing similar tests is very tedious
work: something will inevitably turn the buildfarm red as a watermelon.
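
To make that concrete, such a test could look roughly like the untested
sketch below. The table, index name, byte offset and written value are all
made up; gist_index_parent_check() is the function from this patch; and
finding an offset that a given check reliably notices still takes
experimentation, as your session shows.

use strict;
use warnings;
use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;
use Test::More;

my $node = PostgreSQL::Test::Cluster->new('gist_corruption');
$node->init;
$node->start;

# Build a small GiST index to corrupt.
$node->safe_psql(
    'postgres', q(
    CREATE EXTENSION amcheck;
    CREATE TABLE gist_check (p point);
    INSERT INTO gist_check SELECT point(i, i) FROM generate_series(1, 10000) i;
    CREATE INDEX gist_check_idx ON gist_check USING gist (p);
));

my $relpath = $node->safe_psql('postgres',
    "SELECT pg_relation_filepath('gist_check_idx')");
my $file = $node->data_dir . '/' . $relpath;

# Overwrite one byte in the second page of the index while the server is
# stopped.  The offset is arbitrary: it lands in the line pointer array and
# may need tuning to hit something the verification actually looks at.
$node->stop;
open my $fh, '+<', $file or die "could not open $file: $!";
binmode $fh;
sysseek($fh, 8192 + 24, 0) or die "seek failed: $!";
syswrite($fh, "\xFF", 1) or die "write failed: $!";
close $fh;
$node->start;

# psql exits non-zero when the query raises an ERROR.
my $ret = $node->psql('postgres',
    "SELECT gist_index_parent_check('gist_check_idx', true)");
isnt($ret, 0, 'corruption in GiST index is detected');

done_testing();

A GIN variant should only differ in the schema and in calling
gin_index_parent_check() instead.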

I hope I'll post a version with DEBUG1 reporting and heapallindexed soon.
PFA current state.
Thank you for looking into this!

Best regards, Andrey Borodin.

Attachments:

v16-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v16-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 653bad3f597e774f3d95070b8751d948581f9fa1 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v16 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 307 ++++++++------------------------
 5 files changed, 296 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree and gist
+	 * never use either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 2194a91124..29d100120e 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,4 +1,5 @@
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9021d156eb..8c67edab2a 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,14 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.37.0 (Apple Git-136)

v16-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v16-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From bbb8f217eea72975d1ee6d895edca9c021669dde Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v16 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 798 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 936 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..d98d525c66
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 66e34d8706..f3f097c5c5 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..789259e662
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..59aa860de6
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,798 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool*)callback_state);
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that this is not the result
+					 * of a concurrent page split. So, lock the parent and try
+					 * to find the downlink for the current page. It may be
+					 * missing due to a concurrent page split, and that is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Re-check against the parent tuple, if we could re-find it */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.37.0 (Apple Git-136)

v16-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v16-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From 2d938e125667e31f007d9bf20388ce5aee59b156 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v16 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 518 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 720 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 29d100120e..66e34d8706 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -22,6 +23,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -34,6 +36,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..f08e96f445
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,518 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints the leaf tuples of the GiST index */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static GistCheckState gist_init_heapallindexed(Relation rel);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static GistCheckState
+gist_init_heapallindexed(Relation rel)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+	GistCheckState result;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to B-tree, see verify_nbtree.c.
+	 */
+	total_pages = RelationGetNumberOfBlocks(rel);
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result.filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result.snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result.snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+
+	return result;
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that tuples on internal
+ * pages cover all the key space of the tuples on the leaf pages below them.
+ * To do this, each tuple is checked against the downlink that was followed
+ * to reach its page.
+ *
+ * The downlink tuple is adjusted by the tuples on the referenced child page.
+ * A parent GiST tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	if (heapallindexed)
+		check_state = gist_init_heapallindexed(rel);
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call of
+				 * gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold locks
+				 * on both parent and child buffers. Thus the parent tuple must
+				 * include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* Re-check against the parent tuple, if we could re-find it */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.37.0 (Apple Git-136)

#13Andrey Borodin
amborodin86@gmail.com
In reply to: Andrey Borodin (#12)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sun, Nov 27, 2022 at 1:29 PM Andrey Borodin <amborodin86@gmail.com> wrote:

GiST verification checks only one invariant that can be verified if
page locks are acquired the same way a page split acquires them.
GIN does not require ShareLock because it does not check cross-level invariants.

I was wrong. The GIN check does a similar gin_refind_parent() to lock pages
in a bottom-up manner and truly verify the downlink-to-child-page invariant.

Here's v17. The only difference is that I added progress reporting to
GiST verification.
I still did not implement heapallindexed for GIN. The existence of pending
lists makes this just too difficult for a weekend coding project :(
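
For anyone who wants to give v17 a spin, the checks are invoked like this
(just a usage sketch; the index names are placeholders for your own indexes):

SET client_min_messages TO debug1;  -- GiST progress is reported at DEBUG1
CREATE EXTENSION amcheck;  -- or: ALTER EXTENSION amcheck UPDATE TO '1.4';
-- GiST: verify parent/child key consistency, optionally fingerprint the heap
SELECT gist_index_parent_check('some_gist_idx', true);
-- GIN: the heapallindexed argument is accepted but, per the above, not implemented yet
SELECT gin_index_parent_check('some_gin_idx', false);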

Thank you!

Best regards, Andrey Borodin.

Attachments:

v17-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v17-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From bfdf8bd8267a37560db4bb5fd2f53164d23068df Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v17 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 538 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 740 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 29d100120e..66e34d8706 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -22,6 +23,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -34,6 +36,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..5a5fa73536
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,538 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState *result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState *result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to B-tree; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that the tuples on internal
+ * pages cover all the key space of the tuples on their child pages.  To do
+ * this, every tuple is checked against the downlink that was followed to
+ * reach its page, using gistgetadjusted(): a parent GiST tuple should never
+ * require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			  check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we might have missed the downlink of the right
+		 * sibling when we scanned the parent.  If so, add the right sibling
+		 * to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not a result of a concurrent
+				 * call of gistplacetopage(). So, lock the parent and try to
+				 * find the downlink for the current page. It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold
+				 * locks on both parent and child buffers. Thus the parent
+				 * tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* Check again against the re-found downlink, if any */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.37.0 (Apple Git-136)

v17-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v17-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 9a24a686594058dc1cdc6ea70ec9adfe397a85b9 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v17 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 798 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 936 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..d98d525c66
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 66e34d8706..f3f097c5c5 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,5 +1,6 @@
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..789259e662
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..c91fc11899
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,798 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through a posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool *) callback_state);	/* TODO: not implemented for GIN yet */
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we might have missed the downlink of the right
+		 * sibling when we scanned the parent.  If so, add the right sibling
+		 * to the stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of
+					 * a concurrent page split. So, lock the parent and try
+					 * to find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Check again against the re-found downlink, if any */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with more tuples than can fit",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple requires
+      adjustment) and that the page graph respects balanced-tree invariants
+      (internal pages reference either only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.37.0 (Apple Git-136)

v17-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v17-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 653bad3f597e774f3d95070b8751d948581f9fa1 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v17 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 307 ++++++++------------------------
 5 files changed, 296 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 2194a91124..29d100120e 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,4 +1,5 @@
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9021d156eb..8c67edab2a 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,14 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.37.0 (Apple Git-136)

In reply to: Andrey Borodin (#13)
Re: Amcheck verification of GiST and GIN

On Sun, 27 Nov 2022, Andrey Borodin wrote:

On Sun, Nov 27, 2022 at 1:29 PM Andrey Borodin <amborodin86@gmail.com> wrote:

I was wrong. The GIN check does a similar gin_refind_parent() to lock pages
in a bottom-up manner and truly verify the downlink-child page invariant.

Does this mean that we need to adjust the docs?

Here's v17. The only difference is that I added progress reporting to
GiST verification.
I still did not implement heapallindexed for GIN. Existence of pending
lists makes this just too difficult for a weekend coding project :(

Thank you!

Best regards, Andrey Borodin.

I'm a bit lost here. I tried your patch again and indeed the
heapallindexed inside gin_check_parent_keys_consistency has a TODO
comment, but it's unclear to me if you are going to implement it or if the
patch "needs review". Right now it's "Waiting on Author".

--
Jose Arthur Benetasso Villanova

#15Robert Haas
robertmhaas@gmail.com
In reply to: Jose Arthur Benetasso Villanova (#14)
Re: Amcheck verification of GiST and GIN

On Wed, Dec 14, 2022 at 7:19 AM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

I'm a bit lost here. I tried your patch again and indeed the
heapallindexed inside gin_check_parent_keys_consistency has a TODO
comment, but it's unclear to me if you are going to implement it or if the
patch "needs review". Right now it's "Waiting on Author".

FWIW, I don't think there's a hard requirement that every index AM
needs to support the same set of amcheck options. Where it makes sense
and can be done in a reasonably straightforward manner, we should. But
sometimes that may not be the case, and that seems fine, too.

--
Robert Haas
EDB: http://www.enterprisedb.com

#16Andrey Borodin
amborodin86@gmail.com
In reply to: Jose Arthur Benetasso Villanova (#14)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

Hi Jose, thank you for the review and sorry for the long delay in answering.

On Wed, Dec 14, 2022 at 4:19 AM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

On Sun, 27 Nov 2022, Andrey Borodin wrote:

On Sun, Nov 27, 2022 at 1:29 PM Andrey Borodin <amborodin86@gmail.com> wrote:

I was wrong. The GIN check does a similar gin_refind_parent() to lock pages
in a bottom-up manner and truly verify the downlink-child page invariant.

Does this mean that we need to adjust the docs?

It seems to me that the gin_index_parent_check() docs are correct.
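
For reference, a minimal sketch of how the documented entry points are invoked (the index names here are just placeholders; the two-argument signatures are the ones declared in amcheck--1.3--1.4.sql):

SELECT gist_index_parent_check('some_gist_index', true);
SELECT gin_index_parent_check('some_gin_index', true);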

Here's v17. The only difference is that I added progress reporting to
GiST verification.
I still did not implement heapallindexed for GIN. Existence of pending
lists makes this just too difficult for a weekend coding project :(

Thank you!

Best regards, Andrey Borodin.

I'm a bit lost here. I tried your patch again and indeed the
heapallindexed inside gin_check_parent_keys_consistency has a TODO
comment, but it's unclear to me if you are going to implement it or if the
patch "needs review". Right now it's "Waiting on Author".

Please find attached the new version. In this patchset the heapallindexed
flag is removed from the GIN checks.
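
For anyone who wants to try it quickly, the attached regression test exercises the new GIN check roughly like this (object names are taken from the attached sql/check_gin.sql):

CREATE TABLE "gin_check"("Column1" int[]);
INSERT INTO gin_check select array_agg(round(random()*255)) from generate_series(1, 100000) as i group by i % 10000;
CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
SELECT gin_index_parent_check('gin_check_idx', true);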

Thank you!

Best regards, Andrey Borodin.

Attachments:

v18-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v18-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 38f8cac157df49c079952eae3de3864f519fa958 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v18 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 798 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 936 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..c914e6d0ba 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..d98d525c66
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 45e9d74947..fec44a6826 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -39,6 +40,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..789259e662
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx', true);
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx', true);
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..c91fc11899
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,798 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	bool		heapallindexed = *((bool *) callback_state); /* TODO: heapallindexed is not implemented for GIN yet */
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we may have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found the downlink, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u in parent block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)
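
For quick reference, the documentation hunk above adds gin_index_parent_check(index regclass, heapallindexed boolean). A minimal usage sketch, against a hypothetical GIN index named gin_check_idx (the heapallindexed argument is accepted but, per the TODO in the patch, not yet acted on for GIN):

SELECT gin_index_parent_check('gin_check_idx', false);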

v18-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v18-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From 65733ff09b6133e630e4c1d5678cb66b48a3cbe2 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v18 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 538 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 740 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..45e9d74947 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..5a5fa73536
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,538 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState *result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState *result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph.  This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on a leaf page.  To do this, every tuple is
+ * checked against the downlink in its parent page.
+ *
+ * For each tuple, gistgetadjusted() is called on the parent tuple and the
+ * child tuple; the parent tuple should never require any adjustment to
+ * cover the child.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			  check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified level %u blocks of approximately %u total",
+				check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we may have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call of
+				 * gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold locks
+				 * on both parent and child buffers. Thus the parent tuple must
+				 * include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If we re-found the downlink, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u in parent block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)
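
The regression script above exercises gist_index_parent_check() on individually named indexes. As an illustrative sketch (not part of the patch), the same check can be driven from the system catalogs to cover every valid GiST index in the current database, assuming amcheck has been updated to 1.4:

-- sketch: check every valid GiST index; heapallindexed=false keeps the pass cheap
SELECT c.oid::regclass AS index_name,
       gist_index_parent_check(c.oid::regclass, false)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am a ON a.oid = c.relam
WHERE a.amname = 'gist' AND i.indisvalid;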

v18-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v18-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 89a87b5239d0aabcc2208939f213df747c686f16 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v18 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 307 ++++++++------------------------
 5 files changed, 296 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error for lock modes not allowed during recovery.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error for lock modes not allowed during recovery.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..3c1599f215 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,14 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)

#17Andrey Borodin
amborodin86@gmail.com
In reply to: Andrey Borodin (#16)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sun, Jan 8, 2023 at 8:05 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Please find the attached new version. In this patchset the heapallindexed
flag is removed from the GIN checks.

Uh... sorry, git-formatted wrong branch.
Here's the correct version. Double checked.
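
If you want a quick smoke test of the attached version, the new entry
points can be exercised the same way the added regression tests do (the
index names below are just placeholders; any GIN or GiST index will do):

    SELECT gin_index_parent_check('some_gin_idx');
    SELECT gist_index_parent_check('some_gist_idx', true); -- heapallindexed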

Best regards, Andrey Borodin.

Attachments:

v19-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v19-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 42bc27c4ed058ec24fe582d771989cbfa9a0b38b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v19 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 793 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 931 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..5e283be45b 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 45e9d74947..fec44a6826 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -39,6 +40,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..8fd00513f7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,793 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between the parent and child
+					 * tuples.  We need to verify that it is not the result of
+					 * a concurrent page split.  So, lock the parent and try to
+					 * re-find the downlink for the current page.  The downlink
+					 * may be missing due to a concurrent split; that is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink was re-found, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with more tuples than can fit",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 9397a69c6e..7ffa36b205 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)

v19-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v19-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 89a87b5239d0aabcc2208939f213df747c686f16 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v19 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 307 ++++++++------------------------
 5 files changed, 296 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should run the AM-specific "checkable" callback before
+ * calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should have storage.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..3c1599f215 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,14 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)

v19-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v19-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From 65733ff09b6133e630e4c1d5678cb66b48a3cbe2 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v19 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 538 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 740 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..45e9d74947 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..5a5fa73536
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,538 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState *result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState *result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and
+ * performs a depth-first scan of the GiST graph.  This function verifies
+ * that the tuples of internal pages cover all the key space of each tuple
+ * on the referenced leaf pages.  To do this, every downlink is compared
+ * against the tuples on the child page it points to.
+ *
+ * The comparison calls gistgetadjusted() on each parent/child tuple pair;
+ * a parent GiST tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			  check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified level %u blocks of approximately %u total",
+				check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split after we looked at the
+		 * parent, so we may have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call of
+				 * gistplacetopage(). So, lock parent and try to find downlink
+				 * for the current page.  It may be missing due to a
+				 * concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now, we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If the downlink is gone, assume a concurrent split; otherwise re-check */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 5d61a33936..9397a69c6e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)
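
For reference, here is a minimal usage sketch of the new function. The table
and index names below are made up for illustration (the regression script
check_gist.sql above exercises the same calls), and it assumes the amcheck
extension is installed at, or updated to, version 1.4:

CREATE EXTENSION IF NOT EXISTS amcheck;
-- or, on an existing installation: ALTER EXTENSION amcheck UPDATE TO '1.4';

-- hypothetical table with a GiST index over a point column
CREATE TABLE gist_demo AS
  SELECT point(random(), random()) AS c FROM generate_series(1, 1000);
CREATE INDEX gist_demo_idx ON gist_demo USING gist (c);

-- structural (parent-child and graph) checks only
SELECT gist_index_parent_check('gist_demo_idx', false);
-- additionally verify that every heap tuple has a matching index tuple
SELECT gist_index_parent_check('gist_demo_idx', true);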

#18Jose Arthur Benetasso Villanova
jose.arthur@gmail.com
In reply to: Andrey Borodin (#17)
Re: Amcheck verification of GiST and GIN

On Sun, 8 Jan 2023, Andrey Borodin wrote:

On Sun, Jan 8, 2023 at 8:05 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Please find the attached new version. In this patchset heapallindexed
flag is removed from GIN checks.

Uh... sorry, git-formatted wrong branch.
Here's the correct version. Double checked.

Hello again.

I applied the patch without errors or warnings and ran the same tests.
Everything works as expected.

The only thing I found is that the documentation for gin_index_parent_check
still references the signature "gin_index_parent_check(index regclass,
heapallindexed boolean) returns void".
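
Since the heapallindexed flag was removed from the GIN checks in this
patchset, presumably the documented signature and the call will take just the
index argument, something like this (the index name here is hypothetical):

SELECT gin_index_parent_check('some_gin_index');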

--
Jose Arthur Benetasso Villanova

#19Andrey Borodin
amborodin86@gmail.com
In reply to: Jose Arthur Benetasso Villanova (#18)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Fri, Jan 13, 2023 at 3:46 AM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

The only thing I found is that the documentation for gin_index_parent_check
still references the signature "gin_index_parent_check(index regclass,
heapallindexed boolean) returns void".

Correct! Please find the attached fixed version.

Thank you!

Best regards, Andrey Borodin.

Attachments:

v20-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v20-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From e4d5ace7e8888a2fac4fa04f091fa2dd4ebf8342 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v20 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  27 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 307 ++++++++------------------------
 5 files changed, 296 insertions(+), 229 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable()
+ * before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error for lock levels above AccessShareLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error for lock levels above AccessShareLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..10906efd8a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,27 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..3c1599f215 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,14 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
 #include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +139,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +183,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +207,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +235,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +254,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +319,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +691,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +773,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +991,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1027,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1340,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1633,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1643,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1763,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1867,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1918,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2005,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2237,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2261,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2310,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2680,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2803,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3041,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)
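
A note for readers skimming the series: once the extension is updated with
ALTER EXTENSION amcheck UPDATE TO '1.4', the new checks are invoked the same
way as the existing btree ones. A minimal usage sketch (the index names are
hypothetical; the signatures come from the SQL scripts in these patches):

    -- nbtree checks, unchanged by the refactoring above
    SELECT bt_index_check('some_btree_idx', true);
    -- GiST check, optionally verifying that all heap tuples are indexed
    SELECT gist_index_parent_check('some_gist_idx', true);
    -- GIN parent-child consistency check
    SELECT gin_index_parent_check('some_gin_idx');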

v20-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch
From 7e758b4c5f97e250c201a0fc59b4b9420040558a Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v20 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 793 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 931 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..5e283be45b 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 45e9d74947..fec44a6826 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -39,6 +40,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..8fd00513f7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,793 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph,
+ * verifying parent-child key consistency and ordering invariants.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples.  We need to verify that it is not a result of
+					 * a concurrent page split.  So, lock the parent and try
+					 * to find the downlink for the current page.  It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Re-check the key against the re-found parent tuple, if any */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index abe3135132..f472554ec7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)
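
As with the other amcheck functions, the GIN check can be driven from the
system catalogs. A sketch (not part of the patch) that runs it over every
valid GIN index in the current database:

    SELECT c.relname AS index_name, gin_index_parent_check(c.oid::regclass)
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    JOIN pg_am a ON a.oid = c.relam
    WHERE a.amname = 'gin' AND i.indisvalid;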

v20-0002-Add-gist_index_parent_check-function-to-verify-G.patch
From b7800362dd644654394e4d24ee18f7fb7cd39a73 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v20 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 538 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 740 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..45e9d74947 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..5a5fa73536
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,538 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include the tuples
+ * on their child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState *result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState *result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to the B-tree case, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  This function verifies that the tuples of internal
+ * pages cover all the key space of each tuple on a leaf page.  To do this we
+ * descend the tree depth-first and, for every tuple, check that it does not
+ * require any adjustment of the downlink tuple it was reached through (see
+ * gistgetadjusted()).  A parent GiST tuple should never require adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			  check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage(). So, lock the parent and try to
+				 * find the downlink for the current page. It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now we hold
+				 * locks on both the parent and child buffers. Thus the parent
+				 * tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* We found it - make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 923cbde9dd..abe3135132 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)
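
To exercise the new GiST check across a whole database, the bt_index_check
example already in the amcheck documentation can be adapted along these lines
(only a sketch, not part of the patch; invalid indexes and temporary relations
are skipped here, roughly matching the checks in gist_index_checkable()):

SELECT c.relname, gist_index_parent_check(c.oid, false)
FROM pg_index i
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_am am ON c.relam = am.oid
WHERE am.amname = 'gist'
  AND c.relpersistence <> 't'
  AND i.indisvalid
ORDER BY c.relpages DESC;

Passing true as the second argument additionally verifies, via the Bloom
filter built in verify_gist.c, that every heap tuple has a matching index
tuple, at the cost of a full scan of the table.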

In reply to: Andrey Borodin (#19)
Re: Amcheck verification of GiST and GIN

On Fri, 13 Jan 2023, Andrey Borodin wrote:

On Fri, Jan 13, 2023 at 3:46 AM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

The only thing that I found is the gin_index_parent_check function in docs
still references the "gin_index_parent_check(index regclass,
heapallindexed boolean) returns void"

Correct! Please find the attached fixed version.

Thank you!

Best regards, Andrey Borodin.

Hello again. I see the change. Thanks

--
Jose Arthur Benetasso Villanova

#21Andrey Borodin
amborodin86@gmail.com
In reply to: Jose Arthur Benetasso Villanova (#20)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Fri, Jan 13, 2023 at 7:35 PM Jose Arthur Benetasso Villanova
<jose.arthur@gmail.com> wrote:

Hello again. I see the change. Thanks

Thanks! I also found out that there was a CI complaint about amcheck.h
not including some necessary stuff. Here's a version with a fix for
that.

Best regards, Andrey Borodin.

Attachments:

v21-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v21-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From b75c19ede76c40f5c3a0acdf36d462d9b60aef8b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v21 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/amcheck.c              |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 793 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 8 files changed, 931 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index a817419581..ecb849a605 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -5,6 +5,7 @@ OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
 	verify_gist.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
@@ -14,7 +15,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap check_gist
+REGRESS = check check_btree check_heap check_gist check_gin
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379ef..5e283be45b 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 3793b0cd93..9999a233f8 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -83,7 +83,7 @@ amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
 	else
 	{
 		heaprel = NULL;
-		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		/* Set these just to suppress "uninitialized variable" warnings */
 		save_userid = InvalidOid;
 		save_sec_context = -1;
 		save_nestlevel = -1;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 45e9d74947..fec44a6826 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -39,6 +40,7 @@ tests += {
       'check_btree',
       'check_heap',
       'check_gist',
+      'check_gin',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..8fd00513f7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,793 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include the tuples
+ * on their child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN
+ * posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state);
+static bool check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel, BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid, gin_index_checkable,
+		gin_check_parent_keys_consistency, AccessShareLock, NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph,
+ * verifying the consistency of its parent-child keys.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			} else {
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 &&
+				ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			int			lowersize;
+			ItemPointerData bound;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno,
+					 maxoff,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items", stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff).
+			 * Make sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was binary-upgraded
+			 * from an earlier version. That was a long time ago, though, so let's
+			 * warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+			}
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				if (!ItemPointerEquals(&stack->parentkey, &bound))
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+									RelationGetRelationName(rel),
+									ItemPointerGetBlockNumberNoCheck(&bound),
+									ItemPointerGetOffsetNumberNoCheck(&bound),
+									stack->blkno,
+									stack->parentblk,
+									ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+									ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+				}
+			}
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/* The rightmost item in the tree level has (0, 0) as the key */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+					}
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff)
+				{
+					if (ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+
+					}
+				}
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check. Allocates a memory context and scans
+ * through the GIN graph, descending into posting trees where present.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		if (!check_index_page(rel, buffer, stack->blkno))
+		{
+			goto nextpage;
+		}
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, maxoff, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key, page_max_key_category, parent_key, parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+				goto nextpage;
+			}
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GinPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+				continue;
+			}
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key, prev_key_category, current_key, current_key_category) >= 0)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* We found it - make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state, stack->parenttup, &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key, current_key_category, parent_key, parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+					{
+						ereport(WARNING,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+					}
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+nextpage:
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static bool
+gincheckpage(Relation rel, Buffer buf)
+{
+	Page		page = BufferGetPage(buf);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buf)),
+				 errhint("Please REINDEX it.")));
+		return false;
+	}
+	return true;
+}
+
+static bool
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	if (!gincheckpage(rel, buffer))
+		return false;
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+			return false;
+		}
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with more tuples than can fit",
+						RelationGetRelationName(rel), blockNo)));
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GinPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index abe3135132..f472554ec7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)
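
For reference, a minimal usage sketch of the gin_index_parent_check() function
documented above (the index name is hypothetical, and this assumes the amcheck
extension has been updated to the version shipped with this patch):

    CREATE EXTENSION IF NOT EXISTS amcheck;

    -- verify parent-child key consistency of a GIN index
    SELECT gin_index_parent_check('some_gin_idx');

The function returns void; in this version of the patch most inconsistencies
are reported as WARNINGs rather than aborting the scan.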

v21-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v21-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From a5a92975e2fac3e306d9ffb339dae0301996e444 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v21 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   2 +
 contrib/amcheck/amcheck.c       | 188 +++++++++++++++++++
 contrib/amcheck/amcheck.h       |  29 +++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 308 ++++++++------------------------
 5 files changed, 298 insertions(+), 230 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..f10fd9d89d 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,11 +3,13 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
 DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..3793b0cd93
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,188 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool
+amcheck_index_mainfork_expected(Relation rel);
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should have run the AM-specific "checkable" callback before
+ * calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid, IndexCheckableCallback checkable,
+												IndexDoCheckCallback check, LOCKMODE lockmode, void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if a lock stronger than
+	 * AccessShareLock is requested.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* for "gcc -Og" https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78394 */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if a lock stronger than
+	 * AccessShareLock was requested.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	Assert(opaquesize == MAXALIGN(opaquesize));
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(opaquesize))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and GiST never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..fac9511f0b
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,29 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel, Relation heaprel, void* state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..37a8e957de 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -34,13 +34,13 @@
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
 
+#include "amcheck.h"
+
 
 PG_MODULE_MAGIC;
 
@@ -138,10 +138,8 @@ typedef struct BtreeLevel
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -184,12 +182,17 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 static inline ItemPointer BTreeTupleGetPointsToTID(IndexTuple itup);
 
+typedef struct BTCheckCallbackState
+{
+	bool parentcheck;
+	bool heapallindexed;
+	bool rootdescend;
+} BTCheckCallbackState;
+
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
  *
@@ -203,12 +206,17 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCheckCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +234,18 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCheckCallbackState args;
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+		bt_index_check_internal_callback, ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -242,126 +253,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 /*
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+static void bt_index_check_internal_callback(Relation indrel, Relation heaprel, void* state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCheckCallbackState* args = (BTCheckCallbackState*) state;
+	bool		heapkeyspace,
+					allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+							args->heapallindexed, args->rootdescend);
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +318,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -793,9 +690,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo_level - 1;
@@ -875,8 +772,8 @@ nextpage:
 			IndexTuple	itup;
 			ItemId		itemid;
 
-			itemid = PageGetItemIdCareful(state, state->targetblock,
-										  state->target, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+										  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 
 			state->lowkey = MemoryContextAlloc(oldcontext, IndexTupleSize(itup));
@@ -1093,8 +990,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -1129,8 +1026,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1442,9 +1339,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			tid = BTreeTupleGetPointsToTID(itup);
 			nhtid = psprintf("(%u,%u)",
@@ -1735,8 +1632,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1745,8 +1642,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)), sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1865,8 +1762,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 
 	if (OffsetNumberIsValid(target_downlinkoffnum))
 	{
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, target_downlinkoffnum);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, target_downlinkoffnum, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		downlink = BTreeTupleGetDownLink(itup);
 	}
@@ -1969,7 +1866,7 @@ bt_child_highkey_check(BtreeCheckState *state,
 			OffsetNumber pivotkey_offset;
 
 			/* Get high key */
-			itemid = PageGetItemIdCareful(state, blkno, page, P_HIKEY);
+			itemid = PageGetItemIdCareful(state->rel, blkno, page, P_HIKEY, sizeof(BTPageOpaqueData));
 			highkey = (IndexTuple) PageGetItem(page, itemid);
 
 			/*
@@ -2020,8 +1917,8 @@ bt_child_highkey_check(BtreeCheckState *state,
 													LSN_FORMAT_ARGS(state->targetlsn))));
 					pivotkey_offset = P_HIKEY;
 				}
-				itemid = PageGetItemIdCareful(state, state->targetblock,
-											  state->target, pivotkey_offset);
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+											  state->target, pivotkey_offset, sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 			}
 			else
@@ -2107,8 +2004,8 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,
 	BTPageOpaque copaque;
 	BTPageOpaque topaque;
 
-	itemid = PageGetItemIdCareful(state, state->targetblock,
-								  state->target, downlinkoffnum);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+								  state->target, downlinkoffnum, sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblock = BTreeTupleGetDownLink(itup);
 
@@ -2339,7 +2236,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 		 RelationGetRelationName(state->rel));
 
 	level = opaque->btpo_level;
-	itemid = PageGetItemIdCareful(state, blkno, page, P_FIRSTDATAKEY(opaque));
+	itemid = PageGetItemIdCareful(state->rel, blkno, page, P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(page, itemid);
 	childblk = BTreeTupleGetDownLink(itup);
 	for (;;)
@@ -2363,8 +2260,8 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 										level - 1, copaque->btpo_level)));
 
 		level = copaque->btpo_level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -2412,7 +2309,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == blkno)
 			return;
@@ -2782,8 +2679,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2905,8 +2802,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -3143,55 +3040,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - MAXALIGN(sizeof(BTPageOpaqueData)))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that enforces that a heap TID is present in
  * cases where that is mandatory (i.e. for non-pivot tuples)
-- 
2.32.0 (Apple Git-132)
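
As a quick sanity check that the refactoring in 0001 keeps the existing B-tree
entry points behaving as before (the index name below is hypothetical), one can
still run:

    -- AccessShareLock path, with the optional heapallindexed argument
    SELECT bt_index_check('some_btree_idx', true);

    -- ShareLock path, with the optional heapallindexed and rootdescend arguments
    SELECT bt_index_parent_check('some_btree_idx', true, true);

Both calls now go through the shared amcheck_lock_relation_and_check() wrapper,
with the lock mode chosen by the SQL-callable function instead of the removed
bt_index_check_internal().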

v21-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v21-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From ca68e194f462cdd95b74c80106c02ba34f6f07aa Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v21 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 ++++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 538 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 740 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f10fd9d89d..a817419581 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,15 +4,17 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_heap check_gist
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..93297379ef
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..9749adfd34
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..45e9d74947 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
       'check',
       'check_btree',
       'check_heap',
+      'check_gist',
     ],
   },
   'tap': {
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..75b9ff4b43
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..5a5fa73536
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,538 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2022, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "access/transam.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "catalog/index.h"
+#include "lib/bloomfilter.h"
+#include "storage/lmgr.h"
+#include "storage/smgr.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "utils/snapmgr.h"
+
+#include "amcheck.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE      *state;
+
+	Snapshot		snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState *result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+												void* callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid		indrelid = PG_GETARG_OID(0);
+	bool	heapallindexed = false;
+
+	if (PG_NARGS() >= 2)
+		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, gist_index_checkable,
+		gist_check_parent_keys_consistency, AccessShareLock, &heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState *result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index.
+	 * This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+						(int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in
+	 * READ COMMITTED mode.  A new snapshot is guaranteed to have all
+	 * the entries it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact
+	 * snapshot was returned at higher isolation levels when that
+	 * snapshot is not safe for index scans of the target index.  This
+	 * is possible when the snapshot sees tuples that are before the
+	 * index's indcheckxmin horizon.  Throwing an error here should be
+	 * very rare.  It doesn't seem worth using a secondary snapshot to
+	 * avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+								result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+					errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph.  This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on the child pages.  To do this, every
+ * downlink is followed to its child page.
+ *
+ * Each tuple on a child page is then checked against its parent tuple
+ * with gistgetadjusted().  A parent GiST tuple should never require any
+ * adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel, void* callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem   *stack;
+	MemoryContext	mctx;
+	MemoryContext	oldcontext;
+	GISTSTATE      *state;
+	int				leafdepth;
+	bool			heapallindexed = *((bool*)callback_state);
+	GistCheckState  check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			  check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks out of approximately %u total",
+				check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we might have missed the downlink of the right
+		 * sibling when we scanned the parent.  If so, add the right sibling
+		 * to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal page traversal encountered a leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now we hold locks
+				 * on both the parent and child buffers.  Thus the parent tuple
+				 * must cover the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* Re-check with the re-found downlink, if any */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+				{
+					bloom_add_element(check_state.filter, (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+				}
+			}
+			/* If this is an internal page, recurse into the child */
+			else
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+							check_state.heaptuplespresent, RelationGetRelationName(heaprel),
+							100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+						  bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 923cbde9dd..abe3135132 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

#22Aleksander Alekseev
aleksander@timescale.com
In reply to: Andrey Borodin (#21)
Re: Amcheck verification of GiST and GIN

Hi Andrey,

Thanks! I also found out that there was a CI complaint about amcheck.h
not including some necessary stuff. Here's a version with a fix for
that.

Thanks for the updated patchset.

One little nitpick I have is that the tests cover only cases when all
the checks pass successfully. The tests don't show that the checks
will fail if the indexes are corrupted. Usually we check this as well,
see bd807be6 and other amcheck-related patches and commits.

--
Best regards,
Aleksander Alekseev

#23Peter Geoghegan
pg@bowt.ie
In reply to: Andrey Borodin (#21)
Re: Amcheck verification of GiST and GIN

On Fri, Jan 13, 2023 at 8:15 PM Andrey Borodin <amborodin86@gmail.com> wrote:

(v21 of patch series)

I can see why the refactoring patch is necessary overall, but I have
some concerns about the details. More specifically:

* PageGetItemIdCareful() doesn't seem like it needs to be moved to
amcheck.c and generalized to work with GIN and GiST.

It seems better to just allow some redundancy, by having static/local
versions of PageGetItemIdCareful() for both GIN and GiST. There are
numerous reasons why that seems better to me. For one thing it's
simpler. For another, the requirements are already a bit different,
and may become more different in the future. I have seriously
considered adding a new PageGetItemCareful() routine to nbtree in the
past (which would work along similar lines when we access
IndexTuples), which would have to be quite different across each index
AM. Maybe this idea of adding a PageGetItemCareful() would totally
supersede the existing PageGetItemIdCareful() function.

But even now, without any of that, the rules for
PageGetItemIdCareful() are already different. For example, with GIN
you cannot have LP_DEAD bits set, so ISTM that you should be checking
for that in its own custom version of PageGetItemIdCareful().

You can just have comments that refer the reader to the original
nbtree version of PageGetItemIdCareful() for a high level overview.

* You have distinct versions of the current btree_index_checkable()
function for both GIN and GiST, which doesn't seem necessary to me --
so this is kind of the opposite of the situation with
PageGetItemIdCareful() IMV.

The only reason to have separate versions of these is to detect when
the wrong index AM is used -- the other 2 checks are 100% common to
all index AMs. Why not just move that one non-generic check out of the
function, to each respective index AM .c file, while keeping the other
2 generic checks in amcheck.c?

Once things are structured this way, it would then make sense to add a
can't-be-LP_DEAD check to the GIN specific version of
PageGetItemIdCareful().

I also have some questions about the verification functionality itself:

* Why haven't you done something like palloc_btree_page() for both
GiST and GIN, and use that for everything?

Obviously this may not be possible in 100% of all cases -- even
verify_nbtree.c doesn't manage that. But I see no reason why it
couldn't be done here. Though it's not exactly clear what's going on
with buffer lock coupling in general.
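
For reference, the skeleton I have in mind is just the verify_nbtree.c
pattern, minus the extra sanity checks that palloc_btree_page() does
after copying (sketch only; the function name is made up):

/* Read a GiST page into a palloc'd copy, holding the lock only briefly */
static Page
palloc_gist_page(Relation rel, BufferAccessStrategy strategy,
				 BlockNumber blocknum)
{
	Buffer		buffer;
	Page		page;

	page = palloc(BLCKSZ);

	buffer = ReadBufferExtended(rel, MAIN_FORKNUM, blocknum, RBM_NORMAL,
								strategy);
	LockBuffer(buffer, GIST_SHARE);

	/*
	 * Copy the whole page while holding a share lock, then drop the lock
	 * and pin right away.  All later checks run against the local copy,
	 * so nothing downstream ever needs to couple buffer locks.
	 */
	memcpy(page, BufferGetPage(buffer), BLCKSZ);
	UnlockReleaseBuffer(buffer);

	return page;
}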

* Why does gin_refind_parent() buffer lock the parent while the child
buffer lock remains held?

In any case this doesn't really need to have any buffer lock coupling.
Aren't both of the new verification functions you're adding "parent"
variants, which acquire a ShareLock to block concurrent modifications
and concurrent VACUUM?

* Oh wait, they don't use a ShareLock at all -- they use an
AccessShareLock. This means that there are significant inconsistencies
with the verify_nbtree.c scheme.

I now realize that gist_index_parent_check() and
gin_index_parent_check() are actually much closer to bt_index_check()
than to bt_index_parent_check(). I think that you should stick with
the convention of using the word "parent" whenever we'll need a
ShareLock, and omitting "parent" whenever we will only require an
AccessShareLock. I'm not sure if that means that you should change the
lock strength or change the name of the functions. I am sure that you
should follow the general convention that we have already.

I feel rather pessimistic about our ability to get all the details
right with GIN. Frankly I have serious doubts that GIN itself gets
everything right, which makes our task just about impossible. The GIN
README did gain a "Concurrency" section in 2019, at my behest, but in
general the locking protocols are still chronically under-documented,
and have been revised in various ways as a response to bugs. So at
least in the case of GIN, we really need amcheck coverage, but should
take a very conservative approach.

With GIN I think that we need to make the most modest possible
assumptions about concurrency, by using a ShareLock. Without that, I
think that we can have very little confidence in the verification
checks -- the concurrency rules are just too complicated right now.
Maybe it will be possible in the future, but right now I'd rather not
try that. I find it very difficult to figure out the GIN locking
protocol, even for things that seem like they should be quite
straightforward. This situation would be totally unthinkable in
nbtree, and perhaps with GiST.

* Why does the GIN patch change a comment in contrib/amcheck/amcheck.c?

* There is no pg_amcheck patch here, but I think that there should be,
since that is now the preferred and recommended way to run amcheck in
general.

We could probably do something very similar to what is already there
for nbtree. Maybe it would make sense to change --heapallindexed and
--parent-check so that they call your parent check functions for GiST
and GIN -- though the locking/naming situation must be resolved before
we decide what to do here, for pg_amcheck.

--
Peter Geoghegan

#24Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#23)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 11:51 AM Peter Geoghegan <pg@bowt.ie> wrote:

I also have some questions about the verification functionality itself:

I forgot to include another big concern here:

* Why are there only WARNINGs, never ERRORs here?

It's far more likely that you'll run into problems when running
amcheck this way. I understand that the heapam checks can do that, but
that is both more useful and less risky. With heapam we're not
traversing a tree structure in logical/keyspace order. I'm not
claiming that this approach is impossible; just that it doesn't seem
even remotely worth it. Indexes are never supposed to be corrupt, but
if they are corrupt the solution always involves a REINDEX. You never
try to recover the data from an index, since it's redundant and less
authoritative, almost by definition (at least in Postgres).

By far the most important piece of information is that an index has
some non-zero amount of corruption. Any amount of corruption is
supposed to be extremely surprising. It's kind of like if you see one
cockroach in your home. The problem is not that you have one cockroach
in your home; the problem is that you simply have cockroaches. We can
all agree that in some abstract sense, fewer cockroaches is better.
But that doesn't seem to have any practical relevance -- it's a purely
theoretical point. It doesn't really affect what you do about the
problem at that point.

Admittedly there is some value in seeing multiple WARNINGs to true
experts that are performing some kind of forensic analysis, but that
doesn't seem worth it to me -- I'm an expert, and I don't think that
I'd do it this way for any reason other than it being more convenient
as a way to get information about a system that I don't have access
to. Even then, I think that I'd probably have serious doubts about
most of the extra information that I'd get, since it might very well
be a downstream consequence of the same basic problem.

--
Peter Geoghegan

#25Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Peter Geoghegan (#24)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 12:15 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Feb 2, 2023 at 11:51 AM Peter Geoghegan <pg@bowt.ie> wrote:

...

Admittedly there is some value in seeing multiple WARNINGs to true
experts that are performing some kind of forensic analysis, but that
doesn't seem worth it to me -- I'm an expert, and I don't think that
I'd do it this way for any reason other than it being more convenient
as a way to get information about a system that I don't have access
to. Even then, I think that I'd probably have serious doubts about
most of the extra information that I'd get, since it might very well
be a downstream consequence of the same basic problem.

...

I understand your thoughts (I think) and agree with them, but at least
one scenario where I do want to see *all* errors is corruption
prevention – running amcheck in lower environments, not in production,
to predict and prevent issues. For example, not long ago, Ubuntu 16.04
became EOL (in phases), and people needed to upgrade, with a glibc
version change. It was quite good to use amcheck on production clones
(running on a new OS/glibc) to identify all indexes that need to be
rebuilt. Being able to see only one of them would be very inconvenient.
Rebuilding all indexes didn't seem a good idea in the case of large
databases.

#26Peter Geoghegan
pg@bowt.ie
In reply to: Nikolay Samokhvalov (#25)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 12:31 PM Nikolay Samokhvalov
<samokhvalov@gmail.com> wrote:

I understand your thoughts (I think) and agree with them, but at least one
scenario where I do want to see *all* errors is corruption prevention – running
amcheck in lower environments, not in production, to predict and prevent issues.
For example, not long ago, Ubuntu 16.04 became EOL (in phases), and people
needed to upgrade, with glibc version change. It was quite good to use amcheck
on production clones (running on a new OS/glibc) to identify all indexes that
need to be rebuilt. Being able to see only one of them would be very
inconvenient. Rebuilding all indexes didn't seem a good idea in the case of
large databases.

I agree that this matters at the level of whole indexes. That is, if
you want to check every index in the database, it is unhelpful if the
whole process stops just because one individual index has corruption.
Any extra information about the index that is corrupt may not be all
that valuable, but information about other indexes remains almost as
valuable.

I think that that problem should be solved at a higher level, in the
program that runs amcheck. Note that pg_amcheck will already do this
for B-Tree indexes. While verify_nbtree.c won't try to limp on with an
index that is known to be corrupt, pg_amcheck will continue with other
indexes.

We should add a "Tip" to the amcheck documentation on 14+ about this.
We should clearly advise users that they should probably just use
pg_amcheck. Using the SQL interface directly should now mostly be
something that only a tiny minority of experts need to do -- and even
the experts won't do it that way unless they have a good reason to.

--
Peter Geoghegan

#27Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Peter Geoghegan (#26)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 12:43 PM Peter Geoghegan <pg@bowt.ie> wrote:

I agree that this matters at the level of whole indexes.

I already realized my mistake – indeed, having multiple errors for 1 index
doesn't seem to be super practically helpful.

I think that that problem should be solved at a higher level, in the
program that runs amcheck. Note that pg_amcheck will already do this
for B-Tree indexes.

That's a great tool, and it's great it supports parallelization, very useful
on large machines.

We should add a "Tip" to the amcheck documentation on 14+ about this.
We should clearly advise users that they should probably just use
pg_amcheck.

and with -j$N, with high $N (unless it's production)

#28Peter Geoghegan
pg@bowt.ie
In reply to: Nikolay Samokhvalov (#27)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 12:56 PM Nikolay Samokhvalov
<samokhvalov@gmail.com> wrote:

I already realized my mistake – indeed, having multiple errors for 1 index
doesn't seem to be super practically helpful.

I wouldn't mind supporting it if the cost wasn't too high. But I
believe that it's not a good trade-off.

I think that that problem should be solved at a higher level, in the
program that runs amcheck. Note that pg_amcheck will already do this
for B-Tree indexes.

That's a great tool, and it's great it supports parallelization, very useful
on large machines.

Another big advantage of just using pg_amcheck is that running each
index verification in a standalone query avoids needlessly holding the
same MVCC snapshot across all indexes verified (compared to running
one big SQL query that verifies multiple indexes). As simple as
pg_amcheck's approach is (it's doing nothing that you couldn't
replicate in a shell script), in practice its standardized
approach probably makes things a lot smoother, especially in terms of
how VACUUM is impacted.

--
Peter Geoghegan

#29Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#24)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On Thu, Feb 2, 2023 at 12:15 PM Peter Geoghegan <pg@bowt.ie> wrote:

* Why are there only WARNINGs, never ERRORs here?

Attached revision v22 switches all of the WARNINGs over to ERRORs. It
has also been re-indented, and now uses a non-generic version of
PageGetItemIdCareful() in both verify_gin.c and verify_gist.c.
Obviously this isn't a big set of revisions, but I thought that Andrey
would appreciate it if I posted this much now. I haven't thought much
more about the locking stuff, which is my main concern for now.

Who are the authors of the patch, in full? At some point we'll need to
get the attribution right if this is going to be committed.

I think that it would be good to add some comments explaining the high
level control flow. Is the verification process driven by a
breadth-first search, or a depth-first search, or something else?

I think that we should focus on getting the GiST patch into shape for
commit first, since that seems easier.

--
Peter Geoghegan

Attachments:

v22-0002-Add-gist_index_parent_check-function-to-verify-G.patchapplication/octet-stream; name=v22-0002-Add-gist_index_parent_check-function-to-verify-G.patchDownload
From 855c21e319fd44c87870ccd2af120c7ae40a969d Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v22 2/3] Add gist_index_parent_check() function to verify GiST
 index

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 576 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 778 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 6d26551fe..e9e019827 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 000000000..93297379e
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f7..e67ace01c 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 000000000..9749adfd3
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', false);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx1', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+SELECT gist_index_parent_check('gist_check_idx2', true);
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3b..9e7ebc049 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -35,6 +37,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 000000000..75b9ff4b4
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_parent_check('gist_check_idx1', false);
+SELECT gist_index_parent_check('gist_check_idx2', false);
+SELECT gist_index_parent_check('gist_check_idx1', true);
+SELECT gist_index_parent_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 000000000..845d87ee8
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,576 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									gist_index_checkable,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and
+ * performs a depth-first scan of the GiST graph, verifying that the keys on
+ * internal pages cover all the keys of the pages they reference.
+ *
+ * For each tuple, we check whether the downlink in its parent would need to
+ * be adjusted (via gistgetadjusted()) to cover it.  A parent tuple should
+ * never require any adjustment; if it does, and a concurrent split does not
+ * explain the discrepancy, the index is corrupt.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so we might have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also the handling of such tuples in gistdoinsert() and
+			 * gistbulkdelete().  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not the result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; that is OK.
+				 *
+				 * Note that when we acquire the parent tuple now we hold
+				 * locks on both the parent and child buffers.  Thus the
+				 * parent tuple must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* Now re-check against the re-found downlink, if any */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+					bloom_add_element(check_state.filter,
+									  (unsigned char *) idxtuple,
+									  IndexTupleSize(idxtuple));
+			}
+			else
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state.heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and GiST never use either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 2b9c1a920..41f7e952e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.39.0

v22-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v22-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 71e01bbb51c112cd36fccf5be0b923633d31831d Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v22 1/3] Refactor amcheck to extract common locking routines

---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 139 ++++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  28 +++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 202 +++++++++-----------------------
 5 files changed, 222 insertions(+), 149 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e5..6d26551fe 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 000000000..2a2782f4b
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,139 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								IndexCheckableCallback checkable,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 000000000..4630426d1
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,28 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											IndexCheckableCallback checkable,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343..cd81cbf3b 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671..6c90d01f8 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -29,13 +29,12 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -135,13 +134,20 @@ typedef struct BtreeLevel
 	bool		istruerootlevel;
 } BtreeLevel;
 
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+} BTCallbackState;
+
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -203,12 +209,18 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +238,20 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, btree_index_checkable,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -243,125 +260,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+						 args->heapallindexed, args->rootdescend);
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
 }
 
 /*
@@ -398,29 +325,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
-- 
2.39.0

v22-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch (application/octet-stream)
From ce7c009b9d7abcb41f9f79c1216d7001cfbc2c4f Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v22 3/3] Add gin_index_parent_check() to verify GIN index

---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 799 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 936 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index e9e019827..4c672f0db 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 93297379e..5e283be45 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_parent_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 000000000..43fd769a5
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 9e7ebc049..dc2191bd5 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 000000000..9771afffa
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 000000000..ee4696e68
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,799 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_index_checkable(Relation rel);
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									gin_index_checkable,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+
+/*
+ * Check that relation is eligible for GIN verification
+ */
+static void
+gin_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIN_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GIN indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GIN index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we might have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify it is not a result of
+					 * concurrent call of gistplacetopage(). So, lock parent
+					 * and try to find downlink for current page. It may be
+					 * missing due to concurrent page split, this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink was re-found, re-check it before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 41f7e952e..5549c0173 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_parent_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.39.0

#30Andrey Borodin
amborodin86@gmail.com
In reply to: Peter Geoghegan (#29)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

Thanks for working on this, Peter!

On Fri, Feb 3, 2023 at 6:50 PM Peter Geoghegan <pg@bowt.ie> wrote:

I think that we should focus on getting the GiST patch into shape for
commit first, since that seems easier.

Here's the next version. I've focused on the GiST part in this revision.
Changes:
1. Refactored index_checkable() so that it is shared between all AMs.
2. Renamed gist_index_parent_check -> gist_index_check
3. Gathered reviewers (in no particular order). I hope I didn't forget
anyone. GIN patch is based on work by Grigory Kryachko, but
essentially rewritten by Heikki. Somewhat cosmetically whacked by me.
4. Extended comments for GistScanItem,
gist_check_parent_keys_consistency() and gist_refind_parent().

I tried adding support for GiST in pg_amcheck, but it largely assumes
the relation is either a heap or a B-tree. I hope to do that part
tomorrow or in the near future.
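
For reference, until pg_amcheck learns about these AMs the checks can
be invoked directly from SQL. A rough sketch (index names are
placeholders, and the GiST call assumes the two-argument signature
from the attached amcheck--1.3--1.4.sql carries over after the rename):

ALTER EXTENSION amcheck UPDATE;                   -- pick up the new functions
SELECT gin_index_parent_check('some_gin_idx');    -- GIN entry tree and posting trees
SELECT gist_index_check('some_gist_idx', false);  -- GiST, heapallindexed disabled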

Here's the current version. Thank you!

Best regards, Andrey Borodin.

Attachments:

v23-0001-Refactor-amcheck-to-extract-common-locking-routi.patch (application/octet-stream)
From d45ce69e9d15e5da36ca651d7c6ca46cd84399f2 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v23 1/3] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index AMs will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching user context
 - taking care of GUCs changed by index functions
To reuse the existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 ++++++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  30 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 233 +++++++-------------------------
 5 files changed, 256 insertions(+), 182 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..6d26551fe3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..5a9c9429a3
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only indexes of the expected access method are supported as targets for verification"),
+				 errdetail("Relation \"%s\" does not use the expected access method.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..b139da067a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..c2ae2cb011 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -29,13 +29,12 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -135,13 +134,19 @@ typedef struct BtreeLevel
 	bool		istruerootlevel;
 } BtreeLevel;
 
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+} BTCallbackState;
+
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -203,12 +208,18 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +237,20 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -243,182 +259,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+						 args->heapallindexed, args->rootdescend);
 
-	return false;
 }
 
 /*
-- 
2.32.0 (Apple Git-132)

v23-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v23-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 3b6b704f2236e0bc3a08d7173a57aec8de9207a5 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v23 2/3] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are contained in the parent tuple's keyspace.
The traversal holds a lock on only one page at a time until some
discrepancy is found. To re-check a suspicious pair of parent and child
tuples it acquires locks on both parent and child pages in the same
order as a page split does.
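
A rough sketch of that re-check (simplified; the real logic lives in
gist_check_parent_keys_consistency() and gist_refind_parent()):

    /* child page is share-locked at this point */
    if (gistgetadjusted(rel, parenttup, idxtuple, state))
    {
        /* re-read the downlink while holding both parent and child locks */
        parenttup = gist_refind_parent(rel, parentblk, blkno, strategy);
        if (parenttup == NULL)
            ;   /* downlink moved by a concurrent split -- not corruption */
        else if (gistgetadjusted(rel, parenttup, idxtuple, state))
            ereport(ERROR, ...);    /* genuine parent/child inconsistency */
    }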

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 581 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 783 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 6d26551fe3..e9e0198276 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..5d30784b44
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f3baa3776
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..9e7ebc0499 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -35,6 +37,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..0e3a8cf3bb
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..9776969b4c
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,581 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include the
+ * tuples from child pages. Verification also checks graph invariants:
+ * an internal page must have at least one downlink, and an internal
+ * page can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the
+	 * contents of the referenced page. This tuple will be NULL for the
+	 * root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_check(index regclass, heapallindexed boolean)
+ *
+ * Verify the integrity of a GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index. This logic is similar to the B-tree case, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph. The scan is a depth-first search performed with a
+ * stack of GistScanItems. Initially this stack contains only the root block
+ * number; on each iteration the top block number is replaced by the block
+ * numbers referenced from that page.
+ *
+ * This function verifies that the tuples of internal pages cover the whole
+ * key space of their child pages.  To do this, every downlink tuple is
+ * compared against every tuple on the child page it points to: a parent
+ * GiST tuple should never require any adjustment to include a child tuple.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
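+	/* Set up the Bloom filter and snapshot used by the heapallindexed check */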
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan, as it was before
+	 * commit fe280694d introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = stack->parenttup ?
+				CopyIndexTuple(stack->parenttup) : NULL;
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples. We
+				 * need to verify that it is not a result of a concurrent call
+				 * of gistplacetopage(). So, lock the parent and try to find
+				 * the downlink for the current page. It may be missing due to
+				 * a concurrent page split; this is OK.
+				 *
+				 * Note that when we acquire the parent tuple now we hold locks
+				 * on both parent and child buffers. Thus the parent tuple must
+				 * include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If we re-found the downlink, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+					bloom_add_element(check_state.filter,
+									  (unsigned char *) idxtuple,
+									  IndexTupleSize(idxtuple));
+			}
+			else
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
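+	/* heapallindexed: scan the heap and probe the Bloom filter for each tuple */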
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state.heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * That's somewhat suspicious - was the parent page converted to a
+		 * leaf?  Anyway, it's definitely not the page we were looking for.
+		 */
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * nbtree and gist never use either.  Verify that the line pointer has
+	 * storage, too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 2b9c1a9205..40de7c33f5 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
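+     <para>
+      For example, to verify a GiST index and also check that every heap
+      tuple has a matching index tuple (the index name here is only an
+      illustration, borrowed from the regression test):
+<programlisting>
+SELECT gist_index_check('gist_check_idx1', true);
+</programlisting>
+     </para>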
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

v23-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v23-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 980d497e5c8d13431c73701190eb6ec4d069f385 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v23 3/3] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 768 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index e9e0198276..4c672f0db8 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 5d30784b44..ca985fff2e 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 9e7ebc0499..dc2191bd59 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..af9ace2f33
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,768 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include the
+ * tuples from child pages. Verification also checks graph invariants:
+ * an internal page must have at least one downlink, and an internal
+ * page can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of the depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of the depth-first scan of a
+ * GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify the integrity of a GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'.
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so we report a mismatch as corruption anyway.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check. Allocates a memory context and scans
+ * through the GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* A missing downlink means a concurrent split; else re-check */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT, LP_UNUSED or LP_DEAD,
+	 * since GIN uses none of them.  Verify that the line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 40de7c33f5..e5c8d84db9 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects the
+      balanced-tree invariants (internal pages reference only leaf pages
+      or only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)
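
For reference, a minimal session exercising the two SQL-callable entry
points added above could look like the following. The table, index and
data names are purely illustrative; only the function names and
signatures come from the patch:

    CREATE EXTENSION IF NOT EXISTS amcheck;

    -- hypothetical demo table with one GIN and one GiST index
    CREATE TABLE amcheck_demo (doc tsvector, pt point);
    INSERT INTO amcheck_demo
        SELECT to_tsvector('english', 'sample document ' || g), point(g, g)
        FROM generate_series(1, 1000) g;
    CREATE INDEX amcheck_demo_gin ON amcheck_demo USING gin (doc);
    CREATE INDEX amcheck_demo_gist ON amcheck_demo USING gist (pt);

    -- both functions return void and report corruption via errors
    SELECT gin_index_parent_check('amcheck_demo_gin');
    SELECT gist_index_check('amcheck_demo_gist', true);

On a healthy index both calls simply return without complaint.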

#31Andrey Borodin
amborodin86@gmail.com
In reply to: Andrey Borodin (#30)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sat, Feb 4, 2023 at 1:37 PM Andrey Borodin <amborodin86@gmail.com> wrote:

I tried adding support for GiST to pg_amcheck, but it largely assumes
the relation is either a heap or a B-tree. I hope to do that part
tomorrow or in the near future.

Here's v24 == (v23 + a step for pg_amcheck). There are a lot of
shotgun-style changes, but I hope the next index types will be easy to
add now.

Adding Mark to cc, just in case.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v24-0004-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v24-0004-Add-GiST-support-to-pg_amcheck.patchDownload
From 7d896c762ca82f2740c7d780f41dffb402ff8610 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v24 4/4] Add GiST support to pg_amcheck

---
 src/bin/pg_amcheck/pg_amcheck.c      | 205 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  48 ++++---
 3 files changed, 157 insertions(+), 104 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 68f8180c19..fad6350cd2 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,13 +97,13 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -132,7 +131,7 @@ static AmcheckOptions opts = {
 	.parent_check = false,
 	.rootdescend = false,
 	.heapallindexed = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -154,7 +153,8 @@ typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -175,10 +175,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -192,7 +194,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -318,11 +320,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -377,7 +379,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -609,8 +611,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -743,13 +745,20 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+				prepare_gist_command(&sql, rel, free_slot->connection);
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u is not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -827,7 +836,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -869,6 +878,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Like its btree equivalent, prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -908,7 +939,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1057,11 +1088,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1070,7 +1101,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1081,7 +1112,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1096,7 +1127,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1110,7 +1141,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1163,6 +1194,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1376,11 +1409,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1415,14 +1448,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1451,17 +1484,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1719,7 +1752,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1761,7 +1794,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1799,8 +1832,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1823,7 +1856,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1850,7 +1883,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1860,7 +1893,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1868,36 +1901,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1931,12 +1964,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -1965,17 +1998,18 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
 		 * selected above, filtering by exclusion patterns (if any) that match
-		 * btree index names.
+		 * btree index names. Currently, only btree indexes can be PK, but
+		 * this might change in the future.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -1988,7 +2022,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -1996,7 +2030,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBuffer(&sql,
 						  " AND c.relam = %u "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID); /* Do not expect other AMs here */
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2004,7 +2038,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2025,13 +2059,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2045,12 +2079,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2059,29 +2094,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2101,8 +2136,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2112,15 +2148,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2142,10 +2180,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2155,7 +2194,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 58be2c694d..5e8a63a844 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -319,8 +319,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 359abe25a1..8b6326dd3d 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -288,22 +288,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -434,10 +436,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
-- 
2.32.0 (Apple Git-132)
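
Put concretely, the query that prepare_gist_command above constructs per
GiST index comes out roughly like this (the OID 16384 and the schema name
"public" stand in for the actual index OID and the schema where amcheck
is installed):

    SELECT public.gist_index_check(index := c.oid, heapallindexed := false)
    FROM pg_catalog.pg_class c, pg_catalog.pg_index i
    WHERE c.oid = 16384
      AND c.oid = i.indexrelid
      AND c.relpersistence != 't'
      AND i.indisready AND i.indisvalid AND i.indislive;

As with the existing btree path, temporary indexes and indexes that are
invalid or being rebuilt concurrently fall out of the WHERE clause and
are silently skipped.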

v24-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v24-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From d45ce69e9d15e5da36ca651d7c6ca46cd84399f2 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v24 1/4] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index AMs will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching the user context
 - taking care of GUCs changed by index functions
To reuse the existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 ++++++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  30 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 233 +++++++-------------------------
 5 files changed, 256 insertions(+), 182 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..6d26551fe3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..5a9c9429a3
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error for locks above RowExclusiveLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error for locks above RowExclusiveLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("unsupported access method for the requested verification"),
+				 errdetail("Relation \"%s\" is not an index of the expected access method.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..b139da067a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..c2ae2cb011 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -29,13 +29,12 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -135,13 +134,19 @@ typedef struct BtreeLevel
 	bool		istruerootlevel;
 } BtreeLevel;
 
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+} BTCallbackState;
+
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -203,12 +208,18 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +237,20 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -243,182 +259,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+						 args->heapallindexed, args->rootdescend);
 
-	return false;
 }
 
 /*
-- 
2.32.0 (Apple Git-132)
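
The refactoring above only moves the B-tree verification internals behind a
common locking/callback helper; at the SQL level the existing entry points
appear unchanged. For reference, they would still be exercised as before
(the index name here is illustrative only):

    SELECT bt_index_check('some_btree_index', true);
    SELECT bt_index_parent_check('some_btree_index', true, true);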

v24-0002-Add-gist_index_check-function-to-verify-GiST-ind.patch (application/octet-stream)
From 3b6b704f2236e0bc3a08d7173a57aec8de9207a5 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v24 2/4] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are covered by their parent tuple's
keyspace. The traversal holds a lock on only one page at a time, until
a discrepancy is found. To re-check a suspicious pair of parent and
child tuples it acquires locks on both the parent and child pages, in
the same order as a page split does.
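
A minimal usage sketch of the new function (the table and index names
are illustrative only, not part of the patch):

    CREATE EXTENSION amcheck;
    CREATE TABLE gist_demo AS
        SELECT point(random(), s) AS c FROM generate_series(1, 10000) s;
    CREATE INDEX gist_demo_idx ON gist_demo USING gist (c);
    -- structural check: parent tuples must cover their children
    SELECT gist_index_check('gist_demo_idx', false);
    -- additionally verify that every heap tuple has an index entry
    SELECT gist_index_check('gist_demo_idx', true);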

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 581 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 783 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 6d26551fe3..e9e0198276 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..5d30784b44
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f3baa3776
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..9e7ebc0499 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -35,6 +37,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..0e3a8cf3bb
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..9776969b4c
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,581 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple is NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle a concurrent scan of the page.
+	 * It's necessary to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph. The scan is a depth-first search performed with a
+ * stack of GistScanItems. Initially this stack contains only the root block
+ * number. On each iteration the top block number is replaced by the block
+ * numbers it references.
+ *
+ * This function verifies that the tuples of internal pages cover all
+ * the key space of the tuples on the leaf pages below them.  To do this,
+ * every downlink tuple is compared with the tuples on the child page it
+ * references: adjusting the parent tuple by the child's tuples
+ * (gistgetadjusted) should never change it, i.e. a parent tuple should
+ * never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState check_state;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state.state = state;
+	check_state.rel = rel;
+	check_state.heaprel = heaprel;
+
+	check_state.totalblocks = RelationGetNumberOfBlocks(rel);
+	check_state.reportedblocks = 0;
+	check_state.scannedblocks = 0;
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state.deltablocks = Max(check_state.totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, &check_state);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan from before commit
+	 * fe280694d, which introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state.scannedblocks > check_state.reportedblocks +
+			check_state.deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks out of approximately %u total",
+				 check_state.scannedblocks, check_state.totalblocks);
+			check_state.reportedblocks = check_state.scannedblocks;
+		}
+		check_state.scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we did not see the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = stack->parenttup ? CopyIndexTuple(stack->parenttup) : NULL;
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify that it is not a result of a concurrent
+				 * call of gistplacetopage().  So, lock the parent and try to
+				 * find the downlink for the current page.  It may be missing
+				 * due to a concurrent page split; this is OK.
+				 *
+				 * Note that while we re-read the parent tuple we hold locks
+				 * on both the parent and child buffers, so the parent tuple
+				 * must include the keyspace of the child.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If the downlink is gone, assume a concurrent split; else re-check */
+				if (!stack->parenttup)
+					elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			if (GistPageIsLeaf(page))
+			{
+				if (heapallindexed)
+					bloom_add_element(check_state.filter,
+									  (unsigned char *) idxtuple,
+									  IndexTupleSize(idxtuple));
+			}
+			else
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state.snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) &check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state.heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state.filter))));
+
+		UnregisterSnapshot(check_state.snapshot);
+		bloom_free(check_state.filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/* That's somewhat suspicious - parent page converted to leaf? */
+		/* Anyway, it's definitely not the page we were looking for */
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both the parent and
+			 * child pages are locked.  This guarantees that at this
+			 * particular moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * GiST never uses either.  Verify that the line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 2b9c1a9205..40de7c33f5 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple requires
+      adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)
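
For reviewers with amcheck 1.3 already installed, a minimal sketch of the
upgrade path added by amcheck--1.3--1.4.sql (the index name is illustrative
only):

    ALTER EXTENSION amcheck UPDATE TO '1.4';
    SELECT extversion FROM pg_extension WHERE extname = 'amcheck';
    SELECT gist_index_check('some_existing_gist_index', false);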

v24-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch (application/octet-stream)
From 980d497e5c8d13431c73701190eb6ec4d069f385 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v24 3/4] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
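
A minimal usage sketch (table and index names are illustrative only,
mirroring the regression test added by this patch):

    CREATE TABLE gin_demo ("Column1" int[]);
    INSERT INTO gin_demo
        SELECT array_agg(round(random() * 255))
        FROM generate_series(1, 100000) AS i GROUP BY i % 10000;
    CREATE INDEX gin_demo_idx ON gin_demo USING gin ("Column1");
    SELECT gin_index_parent_check('gin_demo_idx');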
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 768 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index e9e0198276..4c672f0db8 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 5d30784b44..ca985fff2e 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 9e7ebc0499..dc2191bd59 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..af9ace2f33
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,768 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of the depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so we report a mismatch as corruption.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check.  Allocates a memory context and scans
+ * through the entry tree graph (and any posting trees it references).
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we did not see the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that this is not the result
+					 * of a concurrent insertion. So, lock the parent and
+					 * try to find the downlink for the current page. It may
+					 * be missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is still there, re-check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 40de7c33f5..e5c8d84db9 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)

#32Michael Banck
mbanck@gmx.net
In reply to: Nikolay Samokhvalov (#27)
Re: Amcheck verification of GiST and GIN

Hi,

On Thu, Feb 02, 2023 at 12:56:47PM -0800, Nikolay Samokhvalov wrote:

On Thu, Feb 2, 2023 at 12:43 PM Peter Geoghegan <pg@bowt.ie> wrote:

I think that that problem should be solved at a higher level, in the
program that runs amcheck. Note that pg_amcheck will already do this
for B-Tree indexes.

That's a great tool, and it's great it supports parallelization, very useful
on large machines.

Right, but unfortunately that is not an option on managed services. It's
clear that this restriction should not be a general guideline for Postgres
development, but it makes the amcheck extension (which is now shipped
everywhere, I believe, due to being in core) somewhat less useful for the
use case of checking your whole database for corruption.
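
For that use case it is still possible to script the per-index checks by
hand. Something along these lines (an untested sketch, using the
gin_index_parent_check() function from the patch set under discussion)
would cover every valid GIN index in the current database:

SELECT c.oid::regclass AS index_name,
       gin_index_parent_check(c.oid)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'gin' AND i.indisvalid;

The same shape of query works for GiST with gist_index_check(); it just
cannot be parallelized or retried the way pg_amcheck does it for B-trees.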

Michael

#33Peter Geoghegan
pg@bowt.ie
In reply to: Andrey Borodin (#31)
Re: Amcheck verification of GiST and GIN

On Sun, Feb 5, 2023 at 4:45 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Here's v24 == (v23 + a step for pg_amcheck). There's a lot of
shotgun-style changes, but I hope next index types will be easy to add
now.

Some feedback on the GiST patch:

* You forgot to initialize GistCheckState.heaptuplespresent to 0.

It might be better to allocate GistCheckState dynamically, using
palloc0(). That's future proof. "Simple and obvious" is usually the
most important goal for managing memory in amcheck code. It can be a
little inefficient if that makes it simpler.

* ISTM that gist_index_check() should allow the caller to omit a
"heapallindexed" argument by specifying "DEFAULT FALSE", for
consistency with bt_index_check().

(Actually there are two versions of bt_index_check(), with
overloading, but that's just because of the way that the extension
evolved over time).
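
Concretely, the amcheck--1.3--1.4.sql entry could look something like this
(just a sketch of one way to spell it):

CREATE FUNCTION gist_index_check(index regclass,
                                 heapallindexed boolean DEFAULT false)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gist_index_check'
LANGUAGE C STRICT;

That keeps a single pg_proc entry while letting callers write
SELECT gist_index_check('some_index') when they don't care about
heapallindexed.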

* What's the point in having a custom memory context that is never reset?

I believe that gistgetadjusted() will leak memory here, so there is a
need for some kind of high level strategy for managing memory. The
strategy within verify_nbtree.c is to call MemoryContextReset() right
after every loop iteration within bt_check_level_from_leftmost() --
which is pretty much once every call to bt_target_page_check(). That
kind of approach is obviously not going to suffer any memory leaks.

Again, "simple and obvious" is good for memory management in amcheck.

* ISTM that it would be clearer if the per-page code within
gist_check_parent_keys_consistency() was broken out into its own
function -- a little like bt_target_page_check().

That way the control flow would be easier to understand when looking
at the code at a high level.

* ISTM that gist_refind_parent() should throw an error about
corruption in the event of a parent page somehow becoming a leaf page.

Obviously this is never supposed to happen, and likely never will
happen, even with corruption. But it seems like a good idea to make
the most conservative possible assumption by throwing an error. If it
never happens anyway, then the fact that we handle it with an error
won't matter -- so the error is harmless. If it does happen then we'll
want to hear about it as soon as possible -- so the error is useful.

* I suggest using c99 style variable declarations in loops.

Especially for things like "for (OffsetNumber offset =
FirstOffsetNumber; ... ; ... )".

--
Peter Geoghegan

#34Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#33)
Re: Amcheck verification of GiST and GIN

On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg@bowt.ie> wrote:

Some feedback on the GiST patch:

I see that the Bloom filter that's used to implement heapallindexed
verification fingerprints index tuples that are formed via calls to
gistFormTuple(), without any attempt to normalize-away differences in
TOAST input state. In other words, there is nothing like
verify_nbtree.c's bt_normalize_tuple() function involved in the
fingerprinting process. Why is that safe, though? See the "toast_bug"
test case within contrib/amcheck/sql/check_btree.sql for an example of
how inconsistent TOAST input state confused verify_nbtree.c's
heapallindexed verification (before bugfix commit eba775345d). I'm
concerned about GiST heapallindexed verification being buggy in
exactly the same way, or in some way that is roughly analogous.

I do have some concerns about there being analogous problems that are
unique to GiST, since GiST is an AM that gives opclass authors many
more choices than B-Tree opclass authors have. In particular, I wonder
if heapallindexed verification needs to account for how GiST
compression might end up breaking heapallindexed. I refer to the
"compression" implemented by GiST support routine 3 of GiST opclasses.
The existence of GiST support routine 7, the "same" routine, also
makes me feel a bit squeamish about heapallindexed verification -- the
existence of a "same" routine hints at some confusion about "equality
versus equivalence" issues.

In more general terms: heapallindexed verification works by
fingerprinting index tuples during the index verification stage, and
then performing Bloom filter probes in a separate CREATE INDEX style
heap-matches-index stage (obviously). There must be some justification
for our assumption that there can be no false positive corruption
reports due only to a GiST opclass (either extant or theoretical) that
follows the GiST contract, and yet allows an inconsistency to arise
that isn't really index corruption. This justification won't be easy
to come up with, since the GiST contract was not really designed with
these requirements in mind. But...we should try to come up with
something.

What are the assumptions underlying heapallindexed verification for
GiST? It doesn't have to be provably correct or anything, but it
should at least be empirically falsifiable. Basically, something that
says: "Here are our assumptions, if we were wrong in making these
assumptions then you could tell that we made a mistake because of X,
Y, Z". It's not always clear when something is corrupt. Admittedly I
have much less experience with GiST than other people, which likely
includes you (Andrey). I am likely missing some context around the
evolution of GiST. Possibly I'm making a big deal out of something
that isn't actually a problem. Unsure.

Here is an example of the basic definition of correctness being
unclear, in a bad way: Is a HOT chain corrupt when its root
LP_REDIRECT points to an LP_DEAD item, or does that not count as
corruption? I'm pretty sure that the answer is ambiguous even today,
or was ambiguous until recently, at least. Hopefully the
verify_heapam.c HOT chain verification patch will be committed,
providing us with a clear *definition* of HOT chain corruption -- the
definition itself may not be the easy part.

On a totally unrelated note: I wonder if we should be checking that
internal page tuples have 0xffff as their offset number? Seems like
it'd be a cheap enough cross-check.

--
Peter Geoghegan

#35Andrey Borodin
amborodin86@gmail.com
In reply to: Peter Geoghegan (#34)
Re: Amcheck verification of GiST and GIN

Hi Peter,

Thanks for the feedback! I'll work on it during the weekend.

On Thu, Mar 16, 2023 at 6:23 PM Peter Geoghegan <pg@bowt.ie> wrote:

existence of a "same" routine hints at some confusion about "equality
versus equivalence" issues.

Hmm...yes, actually, GiST deals with floats routinely. And there might
be some sorts of NaNs and Infs that are equal, but not binary
equivalent.
I'll think more about it.

gistgetadjusted() calls the "same" routine, which for type point uses
FPeq(double A, double B). And this might be a kind of corruption out of
the box, because it's an epsilon comparison with ε = 1.0E-06: GiST
might miss newly inserted data, because the "adjusted" tuple is
considered "same" if the new data is within 0.000001 of a previously
indexed point but outside the known MBRs.
I'll try to reproduce this tomorrow; so far no luck.

Best regards, Andrey Borodin.

#36Andrey Borodin
amborodin86@gmail.com
In reply to: Peter Geoghegan (#34)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On Fri, Mar 17, 2023 at 8:40 PM Andrey Borodin <amborodin86@gmail.com> wrote:

On Thu, Mar 16, 2023 at 6:23 PM Peter Geoghegan <pg@bowt.ie> wrote:

existence of a "same" routine hints at some confusion about "equality
versus equivalence" issues.

Hmm...yes, actually, GiST deals with floats routinely. And there might
be some sorts of NaNs and Infs that are equal, but not binary
equivalent.
I'll think more about it.

gistgetadjusted() calls the "same" routine, which for type point uses
FPeq(double A, double B). And this might be a kind of corruption out of
the box, because it's an epsilon comparison with ε = 1.0E-06: GiST
might miss newly inserted data, because the "adjusted" tuple is
considered "same" if the new data is within 0.000001 of a previously
indexed point but outside the known MBRs.
I'll try to reproduce this tomorrow; so far no luck.

After several attempts to corrupt GiST with this 0.000001 epsilon
adjustment tolerance, I think GiST indexing of points is valid,
because intersection for search purposes is determined with the same
epsilon. So it's kind of odd that

postgres=# select point(0.0000001,0)~=point(0,0);
 ?column?
----------
 t
(1 row)

yet the index works correctly.
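
For instance, an index scan honours the same epsilon (untested sketch):

CREATE TABLE pts (p point);
CREATE INDEX pts_p_idx ON pts USING gist (p);
INSERT INTO pts VALUES (point(0, 0)), (point(0.0000001, 0));
SET enable_seqscan = off;
-- should count both rows, since ~= is the same epsilon comparison
SELECT count(*) FROM pts WHERE p ~= point(0, 0);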

On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Sun, Feb 5, 2023 at 4:45 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Here's v24 == (v23 + a step for pg_amcheck). There's a lot of
shotgun-style changes, but I hope next index types will be easy to add
now.

Some feedback on the GiST patch:

* You forgot to initialize GistCheckState.heaptuplespresent to 0.

It might be better to allocate GistCheckState dynamically, using
palloc0(). That's future proof. "Simple and obvious" is usually the
most important goal for managing memory in amcheck code. It can be a
little inefficient if that makes it simpler.

Done.

* ISTM that gist_index_check() should allow the caller to omit a
"heapallindexed" argument by specifying "DEFAULT FALSE", for
consistency with bt_index_check().

Done.

* What's the point in having a custom memory context that is never reset?

The problem is that we traverse the index with a depth-first scan and
must retain internal tuples for the whole duration of the scan.
And gistgetadjusted() will allocate memory only when corruption is
suspected, so that is an infrequent case.

The context is there only as an overall leak protection mechanism.
Actual memory management is done via pfree() calls.

Again, "simple and obvious" is good for memory management in amcheck.

Yes, it would be great to come up with some "unit of work" contexts.
Yet, right now, palloc'd tuples and scan items have very different lifespans.

* ISTM that it would be clearer if the per-page code within
gist_check_parent_keys_consistency() was broken out into its own
function -- a little like bt_target_page_check().

I've refactored page logic into gist_check_page().

* ISTM that gist_refind_parent() should throw an error about
corruption in the event of a parent page somehow becoming a leaf page.

Done.

* I suggest using c99 style variable declarations in loops.

Done.

On Thu, Mar 16, 2023 at 6:23 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan <pg@bowt.ie> wrote:

Some feedback on the GiST patch:

I see that the Bloom filter that's used to implement heapallindexed
verification fingerprints index tuples that are formed via calls to
gistFormTuple(), without any attempt to normalize-away differences in
TOAST input state. In other words, there is nothing like
verify_nbtree.c's bt_normalize_tuple() function involved in the
fingerprinting process. Why is that safe, though? See the "toast_bug"
test case within contrib/amcheck/sql/check_btree.sql for an example of
how inconsistent TOAST input state confused verify_nbtree.c's
heapallindexed verification (before bugfix commit eba775345d). I'm
concerned about GiST heapallindexed verification being buggy in
exactly the same way, or in some way that is roughly analogous.

FWIW contrib opclasses, AFAIK, always detoast possibly long datums,
see gbt_var_compress()
https://github.com/postgres/postgres/blob/master/contrib/btree_gist/btree_utils_var.c#L281
But there might be opclasses that do not do so...
Also, there are INCLUDEd attributes. Right now we just put them into
the Bloom filter as-is. Does this constitute a TOAST bug as in B-tree?
If so, I think we should use a version of tuple formatting that omits
included attributes...
What do you think?
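
One way to find out might be a test roughly in the spirit of
check_btree.sql's toast_bug case, adapted to GiST with an INCLUDEd
attribute. An untested sketch (table and index names are made up); the
interesting part would be varying the TOAST/compression state of "t"
between heap and index:

CREATE TABLE gist_toast_check (p point, t text);
CREATE INDEX gist_toast_check_idx ON gist_toast_check
    USING gist (p) INCLUDE (t);
INSERT INTO gist_toast_check VALUES (point(0, 0), repeat('x', 3000));
SELECT gist_index_check('gist_toast_check_idx', true);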

I do have some concerns about there being analogous problems that are
unique to GiST, since GiST is an AM that gives opclass authors many
more choices than B-Tree opclass authors have. In particular, I wonder
if heapallindexed verification needs to account for how GiST
compression might end up breaking heapallindexed. I refer to the
"compression" implemented by GiST support routine 3 of GiST opclasses.
The existence of GiST support routine 7, the "same" routine, also
makes me feel a bit squeamish about heapallindexed verification -- the
existence of a "same" routine hints at some confusion about "equality
versus equivalence" issues.

In more general terms: heapallindexed verification works by
fingerprinting index tuples during the index verification stage, and
then performing Bloom filter probes in a separate CREATE INDEX style
heap-matches-index stage (obviously). There must be some justification
for our assumption that there can be no false positive corruption
reports due only to a GiST opclass (either extant or theoretical) that
follows the GiST contract, and yet allows an inconsistency to arise
that isn't really index corruption. This justification won't be easy
to come up with, since the GiST contract was not really designed with
these requirements in mind. But...we should try to come up with
something.

What are the assumptions underlying heapallindexed verification for
GiST? It doesn't have to be provably correct or anything, but it
should at least be empirically falsifiable. Basically, something that
says: "Here are our assumptions, if we were wrong in making these
assumptions then you could tell that we made a mistake because of X,
Y, Z". It's not always clear when something is corrupt. Admittedly I
have much less experience with GiST than other people, which likely
includes you (Andrey). I am likely missing some context around the
evolution of GiST. Possibly I'm making a big deal out of something
that isn't actually a problem. Unsure.

Here is an example of the basic definition of correctness being
unclear, in a bad way: Is a HOT chain corrupt when its root
LP_REDIRECT points to an LP_DEAD item, or does that not count as
corruption? I'm pretty sure that the answer is ambiguous even today,
or was ambiguous until recently, at least. Hopefully the
verify_heapam.c HOT chain verification patch will be committed,
providing us with a clear *definition* of HOT chain corruption -- the
definition itself may not be the easy part.

Rules for compression methods are not described anywhere. And I suspect
that that's intentional, to provide more flexibility.
To make the heapallindexed check work we need the opclass to always
return the same compression result for the same input datum.
All opclasses known to me (built-in and PostGIS) comply with this requirement.

Yet another behavior might be reasonable. Consider a compression method
that learns from the data: it observes that some datums are more
frequent and starts using a shorter representation for them.

The compression function actually is not about compression, but is a
kind of conversion from the heap format to an indexable one. Many
opclasses do not have a compression function at all.
We could require that the checked opclass have no compression function
at all. But GiST is mainly used for PostGIS, and in PostGIS they use
compression to convert complex geometry into a bounding box.

The "same" method is used only in the handling of internal tuples, not
for the leaf tuples that we fingerprint in the Bloom filter.

We can state the requirement for heapallindexed another way: "the
opclass compression method must be a pure function". That's also a very
strict requirement, disallowing all kinds of detoasting, dictionary
compression, etc. And the btree_gist opclasses do not comply :) But
they seem safe for heapallindexed to me.

On a totally unrelated note: I wonder if we should be checking that
internal page tuples have 0xffff as their offset number? Seems like
it'd be a cheap enough cross-check.

Done.

Thank you!

Best regards, Andrey Borodin.

Attachments:

v25-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v25-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 648e34451c0d296a8058d28bb4357d77876cb082 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v25 3/4] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 768 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index e9e0198276..4c672f0db8 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 5d30784b44..ca985fff2e 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 9e7ebc0499..dc2191bd59 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..af9ace2f33
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,768 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of a depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph,
+ * checking parent-child key consistency and graph invariants.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that this is not the result
+					 * of a concurrent insertion. So, lock the parent and
+					 * try to find the downlink for the current page. It may
+					 * be missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is still there, re-check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 40de7c33f5..e5c8d84db9 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages).
+     </para>
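+     <para>
+      For example, to verify a hypothetical GIN index named
+      <literal>some_gin_idx</literal>:
+<programlisting>
+SELECT gin_index_parent_check('some_gin_idx');
+</programlisting>
+     </para>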
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.32.0 (Apple Git-132)

v25-0004-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v25-0004-Add-GiST-support-to-pg_amcheck.patchDownload
From 1323d72f29adbc78a5ea3ff66b956b6c7272cbd5 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v25 4/4] Add GiST support to pg_amcheck

---
 src/bin/pg_amcheck/pg_amcheck.c      | 205 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  48 ++++---
 3 files changed, 157 insertions(+), 104 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 68f8180c19..337399539d 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,13 +97,13 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -132,7 +131,7 @@ static AmcheckOptions opts = {
 	.parent_check = false,
 	.rootdescend = false,
 	.heapallindexed = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -154,7 +153,8 @@ typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -175,10 +175,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -192,7 +194,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -318,11 +320,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -377,7 +379,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -609,8 +611,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -743,13 +745,20 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+				prepare_gist_command(&sql, rel, free_slot->connection);
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u is not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -827,7 +836,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -869,6 +878,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Like its btree equivalent, this prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -908,7 +939,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1057,11 +1088,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1070,7 +1101,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1081,7 +1112,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1096,7 +1127,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1110,7 +1141,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1163,6 +1194,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1376,11 +1409,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1415,14 +1448,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1451,17 +1484,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1719,7 +1752,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1761,7 +1794,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1799,8 +1832,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1823,7 +1856,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1850,7 +1883,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1860,7 +1893,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1868,36 +1901,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1931,12 +1964,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -1965,17 +1998,18 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
 		 * selected above, filtering by exclusion patterns (if any) that match
-		 * btree index names.
+		 * btree index names.  Currently only btree indexes can be primary
+		 * keys, but this might change in the future.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -1988,7 +2022,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -1996,7 +2030,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBuffer(&sql,
 						  " AND c.relam = %u "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID); /* Do not expect other AMs here */
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2004,7 +2038,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2025,13 +2059,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2045,12 +2079,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2059,29 +2094,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2101,8 +2136,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2112,15 +2148,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2142,10 +2180,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2155,7 +2194,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 58be2c694d..5e8a63a844 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -319,8 +319,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 359abe25a1..8b6326dd3d 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -288,22 +288,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -434,10 +436,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
-- 
2.32.0 (Apple Git-132)

v25-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v25-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 9dd36253feb5a213a014a61d2e8e7d8cddc3e585 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v25 1/4] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index access methods will need to take the same precautions before
running checks:
 - ensuring the index is checkable
 - switching to the table owner's user context
 - rolling back any GUC changes made by index functions
To make this functionality reusable, this commit moves it to amcheck.c.
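
For illustration, here is a minimal sketch of how another access method's
SQL-callable check function could reuse the extracted routine.  The names
gin_index_parent_check and gin_check_callback below are hypothetical
examples, not part of this patch:

    #include "postgres.h"

    #include "amcheck.h"
    #include "catalog/pg_am.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(gin_index_parent_check);

    /* AM-specific verification runs here, with heap and index locked */
    static void
    gin_check_callback(Relation indrel, Relation heaprel, void *state)
    {
        /* walk the index and report any corruption via ereport(ERROR, ...) */
    }

    Datum
    gin_index_parent_check(PG_FUNCTION_ARGS)
    {
        Oid         indrelid = PG_GETARG_OID(0);

        /* locks heap then index, switches userid, rolls back GUC changes */
        amcheck_lock_relation_and_check(indrelid, GIN_AM_OID,
                                        gin_check_callback,
                                        ShareLock, NULL);

        PG_RETURN_VOID();
    }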

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 ++++++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  30 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 233 +++++++-------------------------
 5 files changed, 256 insertions(+), 182 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..6d26551fe3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..5a9c9429a3
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if the requested lockmode
+	 * is too strong to be acquired during recovery (e.g. ShareLock).
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if the requested lockmode is too
+	 * strong to be acquired during recovery (e.g. ShareLock).
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only B-Tree indexes are supported as targets for verification"),
+				 errdetail("Relation \"%s\" is not a B-Tree index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..b139da067a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..c2ae2cb011 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -29,13 +29,12 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -135,13 +134,19 @@ typedef struct BtreeLevel
 	bool		istruerootlevel;
 } BtreeLevel;
 
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+} BTCallbackState;
+
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -203,12 +208,18 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +237,20 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -243,182 +259,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+						 args->heapallindexed, args->rootdescend);
 
-	return false;
 }
 
 /*
-- 
2.32.0 (Apple Git-132)

v25-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v25-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 33d273cd74afa035c60a64460a83eb52c4c32951 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v25 2/4] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that every downlink tuple is contained in its parent tuple's
keyspace.  The traversal locks only one page at a time until a
discrepancy is found.  To re-check a suspicious pair of parent and
child tuples, it acquires locks on both the parent and the child page
in the same order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 119 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  42 ++
 contrib/amcheck/verify_gist.c           | 607 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 809 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 6d26551fe3..e9e0198276 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..5d30784b44
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f3baa3776
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,119 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..9e7ebc0499 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -35,6 +37,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..0e3a8cf3bb
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,42 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..dda37e8b27
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,607 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple is checked against the contents of
+	 * the referenced page.  This tuple is NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page.
+	 * It is needed to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, used for re-locking if a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int leafdepth;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state);
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph.  The scan is a depth-first search using a stack of
+ * GistScanItems.  Initially this stack contains only the root block number;
+ * on each iteration the top block number is replaced by the referenced block
+ * numbers.
+ *
+ * This function verifies that tuples on internal pages cover all the key
+ * space of every tuple on the leaf pages.  To do this we invoke
+ * gist_check_page() for every page in the tree.
+ *
+ * gist_check_page() in its turn checks every tuple on the page against the
+ * downlink tuple in its parent via gistgetadjusted().  A parent GiST tuple
+ * should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan, as it was before
+	 * commit fe280694d introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the
+	 * downlink we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such
+		 * tuples.  We consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the
+		 * parent.
+		 */
+		if (stack->parenttup &&
+			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify that it is not the result of a concurrent call
+			 * to gistplacetopage().  So, lock the parent and try to find the
+			 * downlink for the current page.  It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we re-acquire the parent tuple now, we hold
+			 * locks on both the parent and child buffers.  Thus the parent
+			 * tuple must include the keyspace of the child.
+			 */
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+													stack->blkno, strategy);
+
+			/* If we re-found it, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+									(unsigned char *) idxtuple,
+									IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormTuple(state->state, index, values, isnull, true);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, so they can never
+		 * become leaves.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" internal page %u became a leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked.  This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT or LP_UNUSED, since
+	 * GiST never uses either.  Verify that the line pointer has storage,
+	 * too, since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 2b9c1a9205..40de7c33f5 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references either only leaf pages or
+      only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.32.0 (Apple Git-132)

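For readers who want to try the patch, here is a minimal usage sketch (the
index name is hypothetical; it assumes the extension is installed and has
been updated to 1.4):

    CREATE EXTENSION IF NOT EXISTS amcheck;
    ALTER EXTENSION amcheck UPDATE TO '1.4';

    -- structural checks only; takes AccessShareLock on heap and index
    SELECT gist_index_check('some_gist_idx', false);

    -- additionally verify that every heap tuple has a matching index tuple
    SELECT gist_index_check('some_gist_idx', true);
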
#37Andrey Borodin
amborodin86@gmail.com
In reply to: Andrey Borodin (#36)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sun, Mar 19, 2023 at 4:00 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Also, there are INCLUDEd attributes. Right now we just put them as-is
to the bloom filter. Does this constitute a TOAST bug as in B-tree?
If so, I think we should use a version of tuple formatting that omits
included attributes...
What do you think?

I've ported the B-tree TOAST test to GiST, and, as expected, it fails:
it finds a non-indexed tuple for a fresh, valid index (a rough sketch of
that kind of reproduction follows below).
I've implemented normalization, please see gistFormNormalizedTuple().
But there are two problems:
1. I could not come up with a proper way to pfree() the compressed value
after decompressing it. See TODO in gistFormNormalizedTuple().
2. Tuples in the index already seem to be normalized somewhere: they do
not need to be deformed and re-normalized, and it's not clear to me where
that happens.
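
For reference, the reproduction presumably has roughly this shape, patterned
on the existing B-tree TOAST test (table, column, and index names are made
up, and the exact sizes and storage settings needed to trigger the mismatch
may differ; the point is that the heap copy and the INCLUDEd index copy of
the same value end up in different compression states):

    CREATE TABLE gist_toast_check (c point, payload text);
    ALTER TABLE gist_toast_check ALTER COLUMN payload SET STORAGE plain;
    INSERT INTO gist_toast_check
        SELECT point(s, s), repeat('x', 2000) FROM generate_series(1, 10) s;
    CREATE INDEX gist_toast_check_idx ON gist_toast_check
        USING gist (c) INCLUDE (payload);
    -- let the heap copies be rewritten in a different (compressed) form
    ALTER TABLE gist_toast_check ALTER COLUMN payload SET STORAGE extended;
    UPDATE gist_toast_check SET payload = payload;
    -- without normalization of INCLUDEd attributes, the heapallindexed probe
    -- can report a heap tuple with no matching index tuple on a valid index
    SELECT gist_index_check('gist_toast_check_idx', true);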

Thanks!

Best regards, Andrey Borodin.

Attachments:

v26-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v26-0001-Refactor-amcheck-to-extract-common-locking-routi.patchDownload
From 9dd36253feb5a213a014a61d2e8e7d8cddc3e585 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v26 1/4] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index AMs will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching the user context
 - taking care of GUCs changed by index functions
To reuse the existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 ++++++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  30 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 233 +++++++-------------------------
 5 files changed, 256 insertions(+), 182 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221e50..6d26551fe3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..5a9c9429a3
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error if a lock mode stronger
+	 * than AccessShareLock is requested.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error if a lock mode stronger than
+	 * AccessShareLock is requested.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only B-Tree indexes are supported as targets for verification"),
+				 errdetail("Relation \"%s\" is not a B-Tree index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..b139da067a
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,30 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf343a..cd81cbf3bc 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2023, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 257cff671b..c2ae2cb011 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -29,13 +29,12 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -135,13 +134,19 @@ typedef struct BtreeLevel
 	bool		istruerootlevel;
 } BtreeLevel;
 
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+} BTCallbackState;
+
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -203,12 +208,18 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
+	BTCallbackState args;
 
-	if (PG_NARGS() == 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false);
+	if (PG_NARGS() >= 2)
+		args.heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -226,15 +237,20 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() == 3)
-		rootdescend = PG_GETARG_BOOL(2);
+		args.rootdescend = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -243,182 +259,35 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
-	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel))));
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend);
-	}
-
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel))));
 
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, args->parentcheck,
+						 args->heapallindexed, args->rootdescend);
 
-	return false;
 }
 
 /*
-- 
2.37.1 (Apple Git-137.1)

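To sanity-check that the refactored entry points still behave as before, a
quick smoke test after applying this patch could be (index name
hypothetical):

    SELECT bt_index_check('some_btree_idx', true);
    SELECT bt_index_parent_check('some_btree_idx', true, true);
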
v26-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v26-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchDownload
From 5b244859b96ca4f236c529b8384fe8ea8ad7eec6 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v26 3/4] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.3--1.4.sql  |  11 +-
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 768 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index e9e0198276..4c672f0db8 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
index 5d30784b44..ca985fff2e 100644
--- a/contrib/amcheck/amcheck--1.3--1.4.sql
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -11,4 +11,13 @@ RETURNS VOID
 AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
-REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 9e7ebc0499..dc2191bd59 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -37,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..af9ace2f33
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,768 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of a depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and performs a depth-first scan of a
+ * posting tree, verifying invariants on every page.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so treat a mismatch as corruption.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is gone, a concurrent split explains the discrepancy */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with more tuples than can fit",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 40de7c33f5..e5c8d84db9 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -180,6 +180,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.37.1 (Apple Git-137.1)
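
For anyone who wants to exercise the new GIN check by hand, here is a minimal sketch, assuming amcheck has been updated to the version that ships gin_index_parent_check and reusing the gin_check_text_array_idx index from the regression test above:

    -- verify parent-child key consistency and the graph invariants;
    -- corruption is reported as an ERRCODE_INDEX_CORRUPTED error
    SELECT gin_index_parent_check('gin_check_text_array_idx');

On a healthy index the function simply returns void, as described in the documentation added to amcheck.sgml.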

v26-0004-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v26-0004-Add-GiST-support-to-pg_amcheck.patchDownload
From 240f4f42242216794e3348d8027b3ae6abe4f02b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v26 4/4] Add GiST support to pg_amcheck

---
 src/bin/pg_amcheck/pg_amcheck.c      | 205 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  48 ++++---
 3 files changed, 157 insertions(+), 104 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 68f8180c19..337399539d 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,13 +97,13 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -132,7 +131,7 @@ static AmcheckOptions opts = {
 	.parent_check = false,
 	.rootdescend = false,
 	.heapallindexed = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -154,7 +153,8 @@ typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -175,10 +175,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -192,7 +194,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -318,11 +320,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -377,7 +379,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -609,8 +611,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -743,13 +745,20 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+				prepare_gist_command(&sql, rel, free_slot->connection);
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -827,7 +836,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -869,6 +878,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Like its btree equivalent, prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -908,7 +939,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1057,11 +1088,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1070,7 +1101,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1081,7 +1112,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1096,7 +1127,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1110,7 +1141,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1163,6 +1194,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1376,11 +1409,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1415,14 +1448,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1451,17 +1484,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1719,7 +1752,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1761,7 +1794,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1799,8 +1832,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1823,7 +1856,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1850,7 +1883,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1860,7 +1893,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1868,36 +1901,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1931,12 +1964,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -1965,17 +1998,18 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
 		 * selected above, filtering by exclusion patterns (if any) that match
-		 * btree index names.
+		 * btree index names.  Currently, only btree indexes can back a
+		 * primary key, but this might change in the future.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -1988,7 +2022,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -1996,7 +2030,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBuffer(&sql,
 						  " AND c.relam = %u "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID); /* Do not expect other AMs here */
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2004,7 +2038,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2025,13 +2059,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2045,12 +2079,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2059,29 +2094,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2101,8 +2136,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2112,15 +2148,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2142,10 +2180,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2155,7 +2194,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 58be2c694d..5e8a63a844 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -319,8 +319,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 359abe25a1..8b6326dd3d 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -288,22 +288,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -434,10 +436,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
-- 
2.37.1 (Apple Git-137.1)
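
For review convenience, the query that prepare_gist_command() assembles for a single GiST index comes out roughly as below. This is only a sketch: the "public" schema, the false heapallindexed flag, and the 16384 OID stand in for the values substituted at run time from amcheck_schema, the --heapallindexed option, and the relation's OID:

    SELECT "public".gist_index_check(index := c.oid, heapallindexed := false)
    FROM pg_catalog.pg_class c, pg_catalog.pg_index i
    WHERE c.oid = 16384
      AND c.oid = i.indexrelid
      AND c.relpersistence != 't'
      AND i.indisready AND i.indisvalid AND i.indislive;

As with the btree path, temporary indexes and indexes that are not ready, valid, and live are skipped silently rather than reported as errors.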

v26-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v26-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From e1b9ab331422d86de466ecd5707249fda6113182 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v26 2/4] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are included in the parent tuple's
keyspace. The traversal locks only one page at a time until a
discrepancy is found. To re-check a suspicious pair of parent and
child tuples, it acquires locks on both the parent and child pages in
the same order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.3--1.4.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 672 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.3--1.4.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 6d26551fe3..e9e0198276 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 TAP_TESTS = 1
 
diff --git a/contrib/amcheck/amcheck--1.3--1.4.sql b/contrib/amcheck/amcheck--1.3--1.4.sql
new file mode 100644
index 0000000000..5d30784b44
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.3--1.4.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.3--1.4.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.4'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index ab50931f75..e67ace01c9 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.3'
+default_version = '1.4'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..cbc3e27e67
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index cd81cbf3bc..9e7ebc0499 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -24,6 +25,7 @@ install_data(
   'amcheck--1.0--1.1.sql',
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
+  'amcheck--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )
 
@@ -35,6 +37,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..37966423b8
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..f4cc6de8fa
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,672 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple is checked against the contents of the referenced page.
+	 * This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent splits of the page.
+	 * It's necessary to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int leafdepth;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state);
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree's; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph. The scan is a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number. On
+ * each iteration the top block number is replaced by the referenced block numbers.
+ *
+ * This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on a leaf page.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * gist_check_page(), in its turn, takes every tuple on the page and tries
+ * to adjust the parent's downlink tuple by it.  The parent GiST tuple
+ * should never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan, as it was before
+	 * commit fe280694d introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the
+	 * downlink we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such
+		 * tuples.  We do consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the
+		 * parent.
+		 */
+		if (stack->parenttup &&
+			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify that it is not a result of a concurrent call of
+			 * gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+													stack->blkno, strategy);
+
+			/* If we re-found the downlink, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+						stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+									(unsigned char *) idxtuple,
+									IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has an item id on page %u offset %u not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention in the system catalog. Detoasting of key attributes is expected
+ * to be done by opclass decompression methods, if the indexed type might be
+ * toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			//Datum old = compatt[i];
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			//pfree(DatumGetPointer(old)); // TODO: this fails. Why?
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/* 
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular moment
+			 * the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and gist never use either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 2b9c1a9205..40de7c33f5 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -179,6 +179,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.37.1 (Apple Git-137.1)

#38Peter Geoghegan
In reply to: Andrey Borodin (#36)
Re: Amcheck verification of GiST and GIN

On Sun, Mar 19, 2023 at 4:00 PM Andrey Borodin <amborodin86@gmail.com> wrote:

After several attempts to corrupt GiST with this 0.000001 epsilon
adjustment tolerance I think GiST indexing of points is valid.
Because intersection for search purposes is determined with the same epsilon!
So it's kind of odd
postgres=# select point(0.0000001,0)~=point(0,0);
?column?
----------
t
(1 row)
, yet the index works correctly.

I think that it's okay, provided that we can assume deterministic
behavior in the code that forms new index tuples. Within nbtree,
operator classes like numeric_ops are supported by heapallindexed
verification, without any requirement for special normalization code
to make it work correctly as a special case. This is true even though
operator classes such as numeric_ops have similar "equality is not
equivalence" issues, which comes up in other areas (e.g., nbtree
deduplication, which must call support routine 4 during a CREATE INDEX [1]).
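
As a concrete illustration of the "equality is not equivalence" point -- not
from the patch, just stock numeric behaviour -- two values can compare equal
while their stored forms differ:

select '1.0'::numeric = '1.00'::numeric as equal,
       '1.0'::numeric::text  as lhs,
       '1.00'::numeric::text as rhs;
 equal | lhs | rhs
-------+-----+------
 t     | 1.0 | 1.00
(1 row)

so heapallindexed fingerprinting only works if forming the index tuple from
the same heap Datum array is deterministic on both sides.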

The important principle is that amcheck must always be able to produce
a consistent fingerprintable binary output given the same input (the
same heap tuple/Datum array). This must work across all operator
classes that play by the rules for GiST operator classes. We *can*
tolerate some variation here. Well, we really *have* to tolerate a
little of this kind of variation in order to deal with the TOAST input
state thing...but I hope that that's the only complicating factor
here, for GiST (as it is for nbtree). Note that we already rely on the
fact that index_form_tuple() uses palloc0() (not plain palloc) in
verify_nbtree.c, for the obvious reason.

I think that there is a decent chance that it just wouldn't make sense
for an operator class author to ever do something that we need to
worry about. I'm pretty sure that it's just the TOAST thing. But it's
worth thinking about carefully.

[1]: https://www.postgresql.org/docs/devel/btree-support-funcs.html
--
Peter Geoghegan

#39Alexander Lakhin
exclusion@gmail.com
In reply to: Andrey Borodin (#37)
Re: Amcheck verification of GiST and GIN

Hi Andrey,

27.03.2023 01:17, Andrey Borodin wrote:

I've ported the B-tree TOAST test to GiST, and, as expected, it fails.
Finds non-indexed tuple for a fresh valid index.

I've tried to use this feature with the latest patch set and discovered that
the modified pg_amcheck doesn't find any gist indexes when running without a
schema specification. For example:
CREATE TABLE tbl (id integer, p point);
INSERT INTO tbl VALUES (1, point(1, 1));
CREATE INDEX gist_tbl_idx ON tbl USING gist (p);
CREATE INDEX btree_tbl_idx ON tbl USING btree (id);

pg_amcheck -v -s public
prints:
pg_amcheck: checking index "regression.public.btree_tbl_idx"
pg_amcheck: checking heap table "regression.public.tbl"
pg_amcheck: checking index "regression.public.gist_tbl_idx"

but without "-s public" a message about checking of gist_tbl_idx is absent.

As I can see in the server.log, the queries that generate relation lists in
these cases differ in:
... AND ep.pattern_id IS NULL AND c.relam = 2 AND c.relkind IN ('r', 'S', 'm', 't') AND c.relnamespace != 99 ...

... AND ep.pattern_id IS NULL AND c.relam IN (2, 403, 783)AND c.relkind IN ('r', 'S', 'm', 't', 'i') AND ((c.relam = 2
AND c.relkind IN ('r', 'S', 'm', 't')) OR ((c.relam = 403 OR c.relam = 783) AND c.relkind = 'i')) ...
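
For reference, the set of relations that the second query is supposed to pick
up can be approximated with a catalog query like this (just a sketch, not what
pg_amcheck actually generates; relam 2 is heap, 403 is btree, 783 is gist):

SELECT n.nspname, c.relname, a.amname, c.relkind
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND ((a.amname = 'heap' AND c.relkind IN ('r', 'm', 't'))
    OR (a.amname IN ('btree', 'gist') AND c.relkind = 'i'));

With the test objects above it should list tbl, btree_tbl_idx and gist_tbl_idx.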

Best regards,
Alexander

#40vignesh C
vignesh21@gmail.com
In reply to: Andrey Borodin (#37)
Re: Amcheck verification of GiST and GIN

On Mon, 27 Mar 2023 at 03:47, Andrey Borodin <amborodin86@gmail.com> wrote:

On Sun, Mar 19, 2023 at 4:00 PM Andrey Borodin <amborodin86@gmail.com> wrote:

Also, there are INCLUDEd attributes. Right now we just put them as-is
to the bloom filter. Does this constitute a TOAST bug as in B-tree?
If so, I think we should use a version of tuple formatting that omits
included attributes...
What do you think?

I've ported the B-tree TOAST test to GiST, and, as expected, it fails.
Finds non-indexed tuple for a fresh valid index.
I've implemented normalization, plz see gistFormNormalizedTuple().
But there are two problems:
1. I could not come up with a proper way to pfree() compressed value
after decompressing. See TODO in gistFormNormalizedTuple().
2. In the index tuples seem to be normalized somewhere. They do not
have to be deformed and normalized. It's not clear to me how this
happened.

I have changed the status of the commitfest entry to "Waiting on
Author" as there was no follow-up on Alexander's queries. Feel free to
address them and change the commitfest entry accordingly.

Regards,
Vignesh

#41Andrey M. Borodin
x4mmm@yandex-team.ru
In reply to: vignesh C (#40)
Re: Amcheck verification of GiST and GIN

On 20 Jan 2024, at 07:46, vignesh C <vignesh21@gmail.com> wrote:

I have changed the status of the commitfest entry to "Waiting on
Author" as there was no follow-up on Alexander's queries. Feel free to
address them and change the commitfest entry accordingly.

Thanks Vignesh!

At the moment it’s obvious that this change will not be in 17, but I have plans to continue work on this. So I’ll move this item to July CF.

Best regards, Andrey Borodin.

#42Andrey M. Borodin
x4mmm@yandex-team.ru
In reply to: Alexander Lakhin (#39)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On 6 Apr 2023, at 09:00, Alexander Lakhin <exclusion@gmail.com> wrote:

I've tried to use this feature with the latest patch set and discovered that
modified pg_amcheck doesn't find any gist indexes when running without a
schema specification.

Thanks, Alexander! I’ve fixed this problem and rebased on current HEAD.
There’s one more problem in pg_amcheck’s GiST verification. We must check that amcheck is 1.5+ and use GiST verification only in that case…
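
Roughly something like this, presumably (just a sketch -- the real check would
be built into pg_amcheck's query generation, and the exact minimum version is
still to be settled):

SELECT extversion FROM pg_extension WHERE extname = 'amcheck';
-- or, sidestepping version-string comparisons, probe for the function itself:
SELECT EXISTS (SELECT 1 FROM pg_proc
               WHERE proname = 'gist_index_check') AS has_gist_check;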

Best regards, Andrey Borodin.

Attachments:

v27-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v27-0001-Refactor-amcheck-to-extract-common-locking-routi.patch; x-unix-mode=0644Download
From 117043bee378170da9ade9a89c2c7979aaf78b79 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v27 1/4] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other indexes will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching user context
 - taking care of GUCs changed by index functions
To reuse the existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 +++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  31 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 265 ++++++++------------------------
 5 files changed, 273 insertions(+), 198 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d250..97b60c5115 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..bf3427e375
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only B-Tree indexes are supported as targets for verification"), //TODO name AM
+				 errdetail("Relation \"%s\" is not a B-Tree index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..945f2ad443
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539..1b38e0aba7 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 34990c5cea..2e30c0c693 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,14 +30,13 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -158,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool	parentcheck;
+	bool	heapallindexed;
+	bool	rootdescend;
+	bool	checkunique;
+} BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -240,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -266,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -286,193 +304,44 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
+		bool		has_interval_ops = false;
 
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+				ereport(ERROR,
 					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+					has_interval_ops
+					? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+					: 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.42.0

v27-0004-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v27-0004-Add-GiST-support-to-pg_amcheck.patch; x-unix-mode=0644Download
From 2133407a52c5ba94779956715c1b4773415651f8 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v27 4/4] Add GiST support to pg_amcheck

Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 204 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  48 ++++---
 3 files changed, 156 insertions(+), 104 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index a1ad41e766..5cd2a67c23 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -134,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -157,7 +156,8 @@ typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -178,10 +178,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -195,7 +197,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -322,11 +324,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -381,7 +383,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -649,8 +651,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -783,13 +785,20 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+				prepare_gist_command(&sql, rel, free_slot->connection);
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -867,7 +876,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -913,6 +922,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Like the btree equivalent above, prepares a command to check a GiST index.
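+ *
+ * For illustration, the SQL built here looks roughly like the following
+ * (the amcheck schema and the index OID are placeholders that vary per
+ * connection and relation):
+ *
+ *   SELECT <amcheck schema>.gist_index_check(index := c.oid,
+ *                                            heapallindexed := false)
+ *   FROM pg_catalog.pg_class c, pg_catalog.pg_index i
+ *   WHERE c.oid = <index oid> AND c.oid = i.indexrelid
+ *     AND c.relpersistence != 't'
+ *     AND i.indisready AND i.indisvalid AND i.indislive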
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -952,7 +983,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1102,11 +1133,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1115,7 +1146,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1126,7 +1157,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1141,7 +1172,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1155,7 +1186,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1209,6 +1240,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1422,11 +1455,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1461,14 +1494,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1497,17 +1530,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1765,7 +1798,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1807,7 +1840,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1845,8 +1878,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1869,7 +1902,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1896,7 +1929,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1906,7 +1939,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1914,36 +1947,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1977,12 +2010,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2011,7 +2044,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2019,9 +2052,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2034,15 +2067,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2050,7 +2083,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2071,13 +2104,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2091,12 +2124,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2105,29 +2139,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2147,8 +2181,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2158,15 +2193,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2188,10 +2225,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2201,7 +2239,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07..d4cc0664f3 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 4b16bda6a4..7f7bd39f20 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
-- 
2.42.0

v27-0003-Add-gin_index_parent_check-to-verify-GIN-index.patchapplication/octet-stream; name=v27-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch; x-unix-mode=0644Download
From f2b3bd76c511ee3aa85e43aa6e9deac462ff0d4e Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v27 3/4] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 769 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 1 deletion(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f63252ff33..5c3ea8bc6a 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc7236418..a2bca7c203 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90..5c9ddfe075 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..877ecacb9c
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,769 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include <string.h>
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN
+ * posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
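+
+/*
+ * Note: the 'next' links above form a simple to-visit list; the checking
+ * routines below use it to drive an iterative depth-first traversal of the
+ * index instead of recursing.
+ */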
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
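+ *
+ * Example invocation (illustrative; the index name is the one used by the
+ * regression tests in sql/check_gin.sql):
+ *
+ *   SELECT gin_index_parent_check('gin_check_idx'::regclass);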
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through a posting tree graph,
+ * verifying parent-child key consistency and tree invariants.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so report it as corruption if it doesn't match.
+			 */
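+			/*
+			 * In other words (ignoring integer-division slack), we expect
+			 * roughly:
+			 *   pd_lower == MAXALIGN(SizeOfPageHeaderData)
+			 *               + MAXALIGN(sizeof(ItemPointerData))
+			 *               + maxoff * sizeof(PostingItem)
+			 * which is what the division below checks.
+			 */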
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found the parent tuple, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* the last tuple on a level has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find the downlink pointing to 'childblkno' in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT, LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of them.  Verify that the line pointer has
+	 * storage, too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb..e1a471474e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.42.0

v27-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v27-0002-Add-gist_index_check-function-to-verify-GiST-ind.patch; x-unix-mode=0644Download
From c6c84e61c462904b46244a32773d7e2efd0872d7 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v27 2/4] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are included in the parent tuple's
keyspace. The traversal holds a lock on only one page at a time until
some discrepancy is found. To re-check a suspicious pair of parent and
child tuples it acquires locks on both parent and child pages in the
same order as a page split does.
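
For example (a usage sketch; the index name is a placeholder, and the amcheck
extension must be installed at the version that ships this function):

    CREATE EXTENSION IF NOT EXISTS amcheck;
    -- structural checks only
    SELECT gist_index_check('some_gist_index', false);
    -- additionally verify that every heap tuple has a matching index tuple
    SELECT gist_index_check('some_gist_index', true);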

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 672 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 97b60c5115..f63252ff33 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 0000000000..3fc7236418
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c9..c8ba6d7c9b 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..cbc3e27e67
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba7..15ae94cc90 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..37966423b8
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse pages
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..3884d0cc25
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,672 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants:
+ * an internal page must have at least one downlink, and an internal
+ * page can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents of the referenced page.
+	 * This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page.
+	 * It's necessary to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page for re-locking in case parent-child
+	 * tuple discrepancies are found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int leafdepth;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree's, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph. This scan is performed in a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number. On
+ * each iteration the top block number is replaced by the referenced block
+ * numbers.
+ *
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on a leaf page.  To do this we invoke gist_check_page()
+ * for every page.
+ *
+ * gist_check_page() in its turn takes every tuple and checks that it does not
+ * require any adjustment of the downlink tuple in the parent page.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan used before commit
+	 * fe280694d, which introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the
+	 * downlink we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such
+		 * tuples. We do consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the
+		 * parent.
+		 */
+		if (stack->parenttup &&
+			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify that it is not the result of a concurrent call
+			 * of gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+													stack->blkno, strategy);
+
+			/* If we re-found the parent tuple, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+									(unsigned char *) idxtuple,
+									IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has item id on page %u offset %u not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention in the system catalog. Detoasting of key attributes is expected
+ * to be done by opclass decompression methods, if the indexed type might be
+ * toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			//Datum old = compatt[i];
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			//pfree(DatumGetPointer(old)); // TODO: this fails. Why?
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find the downlink pointing to 'childblkno' in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/* 
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular moment
+			 * the tuples must be coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and gist never use either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615b..6eb526c6bb 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.42.0

#43Andrey M. Borodin
x4mmm@yandex-team.ru
In reply to: Andrey M. Borodin (#42)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On 5 Jul 2024, at 17:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

There’s one more problem in pg_amcheck’s GiST verification. We must check that amcheck is 1.5+ and use GiST verification only in that case…

Done. I’ll set the status to “Needs review”.
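
For reference, the amcheck version installed in a database can be inspected with
a catalog query along these lines (a sketch, not necessarily the exact query
pg_amcheck runs):

    SELECT extversion FROM pg_extension WHERE extname = 'amcheck';

GiST verification would then only be attempted when that reports 1.5 or later.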

Best regards, Andrey Borodin.

Attachments:

v28-0001-Refactor-amcheck-to-extract-common-locking-routi.patchapplication/octet-stream; name=v28-0001-Refactor-amcheck-to-extract-common-locking-routi.patch; x-unix-mode=0644Download
From 117043bee378170da9ade9a89c2c7979aaf78b79 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v28 1/4] Refactor amcheck to extract common locking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index AMs will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching the user context
 - taking care of GUCs changed by index functions
To reuse the existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 +++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  31 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 265 ++++++++------------------------
 5 files changed, 273 insertions(+), 198 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d250..97b60c5115 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..bf3427e375
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only B-Tree indexes are supported as targets for verification"), //TODO name AM
+				 errdetail("Relation \"%s\" is not a B-Tree index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..945f2ad443
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539..1b38e0aba7 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 34990c5cea..2e30c0c693 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,14 +30,13 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -158,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool	parentcheck;
+	bool	heapallindexed;
+	bool	rootdescend;
+	bool	checkunique;
+} BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -240,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -266,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -286,193 +304,44 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
+		bool		has_interval_ops = false;
 
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+				ereport(ERROR,
 					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+					has_interval_ops
+					? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+					: 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.42.0

v28-0004-Add-GiST-support-to-pg_amcheck.patch (application/octet-stream)
From e0dcfd76f22bd609592200525c0045d02052bff6 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v28 4/4] Add GiST support to pg_amcheck

Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 268 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 +++++--
 3 files changed, 210 insertions(+), 131 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index a1ad41e766..8a93fd4140 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -134,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -151,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -178,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -195,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -287,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -322,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -381,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -530,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -598,36 +606,33 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of the amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * For now amcheck has only major and minor versions in its version
+		 * string, but we also parse a revision just in case; currently it is
+		 * expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("--checkunique option is not supported by amcheck "
-							   "version \"%s\"", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * The checkunique option is supported by amcheck since version 1.4.
+		 * Skip the requested unique constraint check with a warning if the
+		 * installed amcheck does not yet support it.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("--checkunique option is not supported by amcheck "
+							"version \"%s\"", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -649,8 +654,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -783,13 +788,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -867,7 +888,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -913,6 +934,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Like prepare_btree_command, but prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -952,7 +995,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1102,11 +1145,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_btree_command or prepare_gist_command and outputs them for the user.
+ * The results from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1115,7 +1158,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1126,7 +1169,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1141,7 +1184,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1155,7 +1198,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1209,6 +1252,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1422,11 +1467,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1461,14 +1506,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1497,17 +1542,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1765,7 +1810,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1807,7 +1852,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1845,8 +1890,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1869,7 +1914,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1896,7 +1941,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1906,7 +1951,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1914,36 +1959,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1977,12 +2022,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2011,7 +2056,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2019,9 +2064,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2034,15 +2079,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2050,7 +2095,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2071,13 +2116,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2091,12 +2136,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2105,29 +2151,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2147,8 +2193,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2158,15 +2205,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2188,10 +2237,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2201,7 +2251,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07..d4cc0664f3 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 4b16bda6a4..7da498ea98 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST indexes in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for the checkunique option and GiST checking with unsupported amcheck versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: --checkunique option is not supported by amcheck version "1.3"/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test for GiST verification with old amcheck version');
 done_testing();
-- 
2.42.0

v28-0003-Add-gin_index_parent_check-to-verify-GIN-index.patch (application/octet-stream)
From f2b3bd76c511ee3aa85e43aa6e9deac462ff0d4e Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v28 3/4] Add gin_index_parent_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 769 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 1 deletion(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f63252ff33..5c3ea8bc6a 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc7236418..a2bca7c203 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 0000000000..43fd769a50
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90..5c9ddfe075 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 0000000000..9771afffa5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 0000000000..877ecacb9c
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,769 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of the depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of the depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
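+			/* empty posting list; return a zero-length allocation */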
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
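+		/* uncompressed posting list: a plain array of ItemPointerData */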
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
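+		/* Read the current posting tree page and lock it in share mode */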
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
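+			/* minItem is the lowest possible pointer, so this fetches every item on the page */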
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's complain if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify it is not a result of a
+					 * concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is gone, a concurrent split explains it; else re-check */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of those.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb..e1a471474e 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.42.0

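For anyone trying the patches out, here is a minimal usage sketch of the two entry points defined above. The table, column and index names are invented for illustration only; the function signatures themselves (gin_index_parent_check(regclass) and gist_index_check(regclass, boolean)) are the ones created by the extension scripts in these patches.

CREATE EXTENSION IF NOT EXISTS amcheck;

CREATE TABLE amcheck_demo (body tsvector, pt point);
CREATE INDEX amcheck_demo_gin ON amcheck_demo USING gin (body);
CREATE INDEX amcheck_demo_gist ON amcheck_demo USING gist (pt);

-- verify parent/child key consistency in the GIN entry tree and posting trees
SELECT gin_index_parent_check('amcheck_demo_gin');

-- verify GiST parent/child keys; with the second argument set to true it also
-- checks that every heap tuple has a matching index tuple (heapallindexed)
SELECT gist_index_check('amcheck_demo_gist', true);
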
v28-0002-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v28-0002-Add-gist_index_check-function-to-verify-GiST-ind.patch; x-unix-mode=0644Download
From c6c84e61c462904b46244a32773d7e2efd0872d7 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v28 2/4] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are included in the parent tuple's keyspace.
This traversal takes a lock on one page at a time until some
discrepancy is found. To re-check a suspicious pair of parent and child
tuples it acquires locks on both parent and child pages in the same
order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 672 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 97b60c5115..f63252ff33 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 0000000000..3fc7236418
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c9..c8ba6d7c9b 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..cbc3e27e67
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba7..15ae94cc90 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..37966423b8
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..3884d0cc25
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,672 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the
+	 * contents of the referenced page. It will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page.
+	 * It's necessary to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int leafdepth;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree's, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph. This scan is performed as a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number. On
+ * each iteration the top block number is replaced by the referenced block numbers.
+ *
+ * This function verifies that tuples on internal pages cover all
+ * the keyspace of each tuple on leaf pages.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * gist_check_page() in its turn takes every tuple and tries to
+ * adjust it by tuples on referenced child page.  Parent gist tuple should
+ * never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan from before commit
+	 * fe280694d, which introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the
+	 * downlink we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such
+		 * tuples. We do consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the
+		 * parent.
+		 */
+		if (stack->parenttup &&
+			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify it is not a result of a concurrent call of
+			 * gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks
+			 * on both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+													stack->blkno, strategy);
+
+			/* If the downlink is gone, a concurrent split explains it; else re-check */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+									(unsigned char *) idxtuple,
+									IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has item id on page %u offset %u not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention into the system catalog. Detoasting of key attributes is expected
+ * to be done by opclass decompression methods, if the indexed type might be
+ * toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			//Datum old = compatt[i];
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			//pfree(DatumGetPointer(old)); // TODO: this fails. Why?
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/* 
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular moment
+			 * the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and gist never uses either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615b..6eb526c6bb 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.42.0

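As a quick sanity check of the new GiST entry point, a query along the lines of the existing bt_index_check example in the amcheck documentation can run the verification over every GiST index in a database once the extension is updated. This is only a sketch; heapallindexed is set to false here to keep it cheap, and the query assumes the caller has privileges on the function (the upgrade script revokes it from PUBLIC).

ALTER EXTENSION amcheck UPDATE TO '1.5';

SELECT c.relname, gist_index_check(c.oid, false)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'gist';
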
#44Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Andrey M. Borodin (#43)
13 attachment(s)
Re: Amcheck verification of GiST and GIN

Hi,

On 7/9/24 08:36, Andrey M. Borodin wrote:

On 5 Jul 2024, at 17:27, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

There’s one more problem in pg_amcheck’s GiST verification. We must
check that amcheck is 1.5+ and use GiST verification only in that
case …

Done. I’ll set the status to “Needs review”.

I realized amcheck GIN/GiST support would be useful for testing my
patches adding parallel builds for these index types, so I decided to
take a look at this and do an initial review today.

Attached is a patch series with extra commits to carry the review
comments, plus patches adjusting the formatting by pgindent (the patch
seems far enough along for this).

Let me quickly go through the review comments:

1) Not sure I like 'amcheck.c' very much, I'd probably go with something
like 'verify_common.c' to match naming of the other files. But it's just
nitpicking and I can live with it.

2) amcheck_lock_relation_and_check seems to be the most important
function, yet there's no comment explaining what it does :-(

3) amcheck_lock_relation_and_check still has a TODO to add the correct
name of the AM

4) Do we actually need amcheck_index_mainfork_expected as a separate
function, or could it be a part of index_checkable?

5) The comment for heaptuplespresent says "debug counter" but that does
not really explain what it's for. (I see verify_nbtree has the same
comment, but maybe let's improve that.)

6) I'd suggest moving the GISTSTATE + blocknum fields to the beginning
of GistCheckState, it seems more natural to start with "generic" fields.

7) I'd adjust the gist_check_parent_keys_consistency comment a bit, to
explain what the function does first, and only then explain how.

8) We seem to be copying PageGetItemIdCareful() around, right? And the
copy in _gist.c still references nbtree - I guess that's not right.

9) Why is the GIN function called gin_index_parent_check() and not
simply gin_index_check() as for the other AMs?

10) The debug output in gin_check_posting_tree_parent_keys_consistency triggers
an assertion failure when running with client_min_messages='debug5'; it seems
to be accessing bogus item pointers.

11) Why does it add pg_amcheck support only for GiST and not GIN?

That's all for now. I'll add this to the stress testing of my
index build patches, and if that triggers more issues I'll report those.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
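
A side note on the "amcheck is 1.5+" requirement mentioned upthread: a driver such as pg_amcheck could gate the GiST checks on the installed extension version. A rough sketch of the relevant SQL, with the version number taken from the amcheck--1.4--1.5.sql upgrade script in the patch:

    SELECT extversion FROM pg_extension WHERE extname = 'amcheck';
    -- upgrade if the installed version predates gist_index_check
    ALTER EXTENSION amcheck UPDATE TO '1.5';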

Attachments:

v28-review-0001-Refactor-amcheck-to-extract-common-lockin.patchtext/x-patch; charset=UTF-8; name=v28-review-0001-Refactor-amcheck-to-extract-common-lockin.patchDownload
From 7377e8d1d0073da6144c86090af56fea46d74253 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v28-review 01/13] Refactor amcheck to extract common locking
 routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Other index AMs will need to take the same precautions before doing checks:
 - ensuring the index is checkable
 - switching user context
 - taking care of GUCs changed by index functions
To allow reuse of this existing functionality, this commit moves it to amcheck.c.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile        |   1 +
 contrib/amcheck/amcheck.c       | 173 +++++++++++++++++++++
 contrib/amcheck/amcheck.h       |  31 ++++
 contrib/amcheck/meson.build     |   1 +
 contrib/amcheck/verify_nbtree.c | 265 ++++++++------------------------
 5 files changed, 273 insertions(+), 198 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..97b60c5115a 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	amcheck.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 00000000000..bf3427e375d
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,173 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Relation suitable for checking */
+	index_checkable(indrel, am_id);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+void
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only B-Tree indexes are supported as targets for verification"), //TODO name AM
+				 errdetail("Relation \"%s\" is not a B-Tree index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 00000000000..945f2ad4437
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern void index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 34990c5cea3..2e30c0c693e 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,14 +30,13 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
 #include "commands/tablecmds.h"
 #include "common/pg_prng.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/guc.h"
 #include "utils/memutils.h"
@@ -158,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool	parentcheck;
+	bool	heapallindexed;
+	bool	rootdescend;
+	bool	checkunique;
+} BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -240,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -266,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -286,193 +304,44 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
-
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
+		bool		has_interval_ops = false;
 
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+				ereport(ERROR,
 					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+					has_interval_ops
+					? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+					: 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.45.2
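
To illustrate what this refactoring enables, here is a rough sketch of how another AM's SQL-callable entry point could reuse amcheck_lock_relation_and_check; the hash AM, the function name, and the callback body are purely hypothetical and not part of any patch in this series:

#include "postgres.h"

#include "amcheck.h"
#include "catalog/pg_am.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(hash_index_check);

/* hypothetical per-index verification callback, matching IndexDoCheckCallback */
static void
hash_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
{
	/* AM-specific page and tuple verification would go here */
}

Datum
hash_index_check(PG_FUNCTION_ARGS)
{
	Oid			indrelid = PG_GETARG_OID(0);

	/* locking, ownership switch and GUC nesting are handled by the common code */
	amcheck_lock_relation_and_check(indrelid, HASH_AM_OID,
									hash_check_callback,
									AccessShareLock, NULL);

	PG_RETURN_VOID();
}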

v28-review-0002-review.patchtext/x-patch; charset=UTF-8; name=v28-review-0002-review.patchDownload
From 60a10f1609820462259eadd83e822e9709d038e6 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:09:45 +0200
Subject: [PATCH v28-review 02/13] review

---
 contrib/amcheck/amcheck.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index bf3427e375d..41cdb5e35a4 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -8,6 +8,8 @@
  * IDENTIFICATION
  *	  contrib/amcheck/amcheck.c
  *
+ *
+ * XXX I'd probably call this verify_common.c or something like that.
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
@@ -29,6 +31,8 @@ static bool amcheck_index_mainfork_expected(Relation rel);
  * where there is simply nothing to verify.
  *
  * NB: Caller should call index_checkable() before calling here.
+ *
+ * XXX Wouldn't it be more natural to have this check in index_checkable?
  */
 static bool
 amcheck_index_mainfork_expected(Relation rel)
@@ -45,6 +49,9 @@ amcheck_index_mainfork_expected(Relation rel)
 	return false;
 }
 
+/*
+ * XXX missing comment, and it's the longest/most important function in the file
+ */
 void
 amcheck_lock_relation_and_check(Oid indrelid,
 								Oid am_id,
@@ -153,7 +160,7 @@ index_checkable(Relation rel, Oid am_id)
 		rel->rd_rel->relam != am_id)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"), //TODO name AM
+				 errmsg("only B-Tree indexes are supported as targets for verification"), //FIXME name AM, shouldn't be hhard to lookup in AMOID syscache
 				 errdetail("Relation \"%s\" is not a B-Tree index.",
 						   RelationGetRelationName(rel))));
 
-- 
2.45.2
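
Regarding the FIXME above about naming the AM in the error message: the AMOID syscache lookup the comment has in mind could look roughly like the sketch below (untested; the resulting amname would then replace the hard-coded "B-Tree" wording in errmsg/errdetail). It additionally needs utils/syscache.h and access/htup_details.h:

	HeapTuple	amtuple;
	Form_pg_am	amform;
	char	   *amname;

	amtuple = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
	if (!HeapTupleIsValid(amtuple))
		elog(ERROR, "cache lookup failed for access method %u", am_id);
	amform = (Form_pg_am) GETSTRUCT(amtuple);
	amname = pstrdup(NameStr(amform->amname));
	ReleaseSysCache(amtuple);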

v28-review-0003-pgindent.patchtext/x-patch; charset=UTF-8; name=v28-review-0003-pgindent.patchDownload
From f82459e7c4b082b91707ed9fb4252fc8d72fa058 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:45:49 +0200
Subject: [PATCH v28-review 03/13] pgindent

---
 contrib/amcheck/amcheck.c       |  3 ++-
 contrib/amcheck/verify_nbtree.c | 22 +++++++++++-----------
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
index 41cdb5e35a4..964c06e7376 100644
--- a/contrib/amcheck/amcheck.c
+++ b/contrib/amcheck/amcheck.c
@@ -158,9 +158,10 @@ index_checkable(Relation rel, Oid am_id)
 {
 	if (rel->rd_rel->relkind != RELKIND_INDEX ||
 		rel->rd_rel->relam != am_id)
+		/* FIXME name AM, shouldn't be hhard to lookup in AMOID syscache */
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"), //FIXME name AM, shouldn't be hhard to lookup in AMOID syscache
+				 errmsg("only B-Tree indexes are supported as targets for verification"),
 				 errdetail("Relation \"%s\" is not a B-Tree index.",
 						   RelationGetRelationName(rel))));
 
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 2e30c0c693e..b630e36ead7 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -162,11 +162,11 @@ typedef struct BtreeLastVisibleEntry
  */
 typedef struct BTCallbackState
 {
-	bool	parentcheck;
-	bool	heapallindexed;
-	bool	rootdescend;
-	bool	checkunique;
-} BTCallbackState;
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
 
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
@@ -330,13 +330,13 @@ bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool rea
 		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
 			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
 				has_interval_ops = true;
-				ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
 						RelationGetRelationName(indrel)),
-					has_interval_ops
-					? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					: 0));
+				 has_interval_ops
+				 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+				 : 0));
 	}
 
 	/* Check index, possibly against table it is an index on */
-- 
2.45.2

v28-review-0004-Add-gist_index_check-function-to-verify-G.patchtext/x-patch; charset=UTF-8; name=v28-review-0004-Add-gist_index_check-function-to-verify-G.patchDownload
From 7b33c5ac16672d1894a81bda110330b830b3fa2c Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v28-review 04/13] Add gist_index_check() function to verify
 GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are covered by the parent tuple's
keyspace. The traversal locks only one page at a time, until some
discrepancy is found. To re-check a suspicious pair of parent and
child tuples it acquires locks on both parent and child pages in the
same order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 672 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 97b60c5115a..f63252ff33c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..3884d0cc25b
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,672 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages always cover the tuples
+ * on their child pages. Verification also checks graph invariants:
+ * an internal page must have at least one downlink, and an internal
+ * page can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents of the referenced page.
+	 * This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page.
+	 * It's necessary to avoid missing subtrees of a page that was
+	 * split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case parent-child
+	 * tuple discrepancies are found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+} GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+	/* Debug counter */
+	int64		heaptuplespresent;
+	/* GiST state */
+	GISTSTATE  *state;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int leafdepth;
+} GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState *check_state,GistScanItem *stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+static void
+gist_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for the GiST check. Allocates a memory context and scans
+ * through the GiST graph. The scan is a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number.
+ * On each iteration the top block number is replaced by the referenced blocks.
+ *
+ * This function verifies that the tuples of internal pages cover all
+ * the key space of each tuple on the leaf pages.  To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries to
+ * adjust it by the tuples on the referenced child page.  A parent GiST
+ * tuple should never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		gist_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan from before
+	 * commit fe280694d, which introduced physical order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified level %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot,	/* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+							Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the
+	 * downlink we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such
+		 * tuples.  We do consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the
+		 * parent.
+		 */
+		if (stack->parenttup &&
+			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify that it is not a result of a concurrent call of
+			 * gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+													stack->blkno, strategy);
+
+			/* Check again with the re-found parent tuple, if any */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+									(unsigned char *) idxtuple,
+									IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention in the system catalog. Detoasting of key attributes is
+ * expected to be done by opclass decompression methods, if the indexed
+ * type might be toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+			  Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			//Datum old = compatt[i];
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			//pfree(DatumGetPointer(old)); // TODO: this fails. Why?
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/* 
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf 
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make copy and return it while both parent and child
+			 * pages are locked. This guaranties that at this particular moment
+			 * tuples must be coherent to each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and gist never uses either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.45.2
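
For anyone who wants to poke at the GiST checks interactively, here is a minimal SQL sketch (assuming the patched amcheck from the attachment above is installed; the table and index names are made up for illustration):

CREATE EXTENSION IF NOT EXISTS amcheck;

-- A toy GiST index to verify; any existing GiST index works the same way.
CREATE TABLE gist_demo (p point);
INSERT INTO gist_demo SELECT point(random(), random()) FROM generate_series(1, 10000);
CREATE INDEX gist_demo_idx ON gist_demo USING gist (p);

-- Structural checks only: parent-child key consistency and tree balance.
SELECT gist_index_check('gist_demo_idx', false);

-- Additionally fingerprint the index tuples and verify that every heap
-- tuple is represented in the index (heapallindexed); this also scans
-- the table, so it is slower.
SELECT gist_index_check('gist_demo_idx', true);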

v28-review-0005-review.patchtext/x-patch; charset=UTF-8; name=v28-review-0005-review.patchDownload
From b6a67accbe69af78597b65f9b41560af00e10078 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:38:13 +0200
Subject: [PATCH v28-review 05/13] review

---
 contrib/amcheck/verify_gist.c | 40 +++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
index 3884d0cc25b..5ea3e216b7d 100644
--- a/contrib/amcheck/verify_gist.c
+++ b/contrib/amcheck/verify_gist.c
@@ -54,7 +54,7 @@ typedef struct GistScanItem
 
 	/*
 	 * Reference to parent page for re-locking in case of found parent-child
-	 * tuple discrapencies.
+	 * tuple discrepancies.
 	 */
 	BlockNumber parentblk;
 
@@ -66,8 +66,11 @@ typedef struct GistCheckState
 {
 	/* Bloom filter fingerprints index tuples */
 	bloom_filter *filter;
-	/* Debug counter */
+
+	/* Debug counter	FIXME what does 'debug counter' mean?*/
 	int64		heaptuplespresent;
+
+	/* XXX nitpick: I'd move these 'generic' fields to the beginning */
 	/* GiST state */
 	GISTSTATE  *state;
 
@@ -106,8 +109,7 @@ static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
 
 /*
  * gist_index_check(index regclass)
- *
- * Verify integrity of GiST index.
+ *		Verify integrity of GiST index.
  *
  * Acquires AccessShareLock on heap & index relations.
  */
@@ -126,6 +128,10 @@ gist_index_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+/*
+ * XXX This talks about 'heapallindexed' but it initializes a snapshot too,
+ * but isn't that unrelated / confusing?
+ */
 static void
 gist_init_heapallindexed(Relation rel, GistCheckState * result)
 {
@@ -145,7 +151,6 @@ gist_init_heapallindexed(Relation rel, GistCheckState * result)
 
 	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
 
-
 	/*
 	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
 	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
@@ -171,7 +176,11 @@ gist_init_heapallindexed(Relation rel, GistCheckState * result)
  * Main entry point for GiST check. Allocates memory context and scans through
  * GiST graph. This scan is performed in a depth-first search using a stack of
  * GistScanItem-s. Initially this stack contains only root block number. On
- * each iteration top block numbmer is replcaed by referenced block numbers.
+ * each iteration the top block number is replaced by referenced block numbers.
+ *
+ * XXX I'd move the following paragraph right before "allocates memory". It
+ * describes what the function does, while the "allocates memory" is more
+ * a description of "how" it is done.
  *
  * This function verifies that tuples of internal pages cover all
  * the key space of each tuple on leaf page.  To do this we invoke
@@ -356,10 +365,12 @@ gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
 	pfree(check_state);
 }
 
-static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
-							Page page, bool heapallindexed, BufferAccessStrategy strategy)
+static void
+gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
 {
 	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
 	/* Check that the tree has the same height in all branches */
 	if (GistPageIsLeaf(page))
 	{
@@ -468,7 +479,7 @@ static void gist_check_page(GistCheckState *check_state, GistScanItem *stack,
  */
 static IndexTuple
 gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
-			  Datum *attdata, bool *isnull, ItemPointerData tid)
+						Datum *attdata, bool *isnull, ItemPointerData tid)
 {
 	Datum		compatt[INDEX_MAX_KEYS];
 	IndexTuple	res;
@@ -525,6 +536,7 @@ gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
 	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
 
 	itup->t_tid = *tid;
+
 	/* Probe Bloom filter -- tuple should be present */
 	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
 							IndexTupleSize(itup)))
@@ -602,9 +614,9 @@ gist_refind_parent(Relation rel,
 
 	if (GistPageIsLeaf(parentpage))
 	{
-		/* 
+		/*
 		 * Currently GiST never deletes internal pages, thus they can never
-		 * become leaf 
+		 * become leaf.
 		 */
 		ereport(ERROR,
 				(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -635,6 +647,10 @@ gist_refind_parent(Relation rel,
 	return result;
 }
 
+/*
+ * XXX What does this do? Maybe it should be in core or some common file?
+ * Seems to be used in verify_btree.c too.
+ */
 static ItemId
 PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
 					 OffsetNumber offset)
@@ -656,6 +672,8 @@ PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
 	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
 	 * and gist never uses either.  Verify that line pointer has storage, too,
 	 * since even LP_DEAD items should.
+	 *
+	 * XXX why does this reference nbtree?
 	 */
 	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
 		ItemIdGetLength(itemid) == 0)
-- 
2.45.2

v28-review-0006-pgindent.patchtext/x-patch; charset=UTF-8; name=v28-review-0006-pgindent.patchDownload
From 69c374628c6b1fc7b90703b381ac3e63f9b64dff Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:56:41 +0200
Subject: [PATCH v28-review 06/13] pgindent

---
 contrib/amcheck/verify_gist.c | 86 +++++++++++++++++------------------
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
index 5ea3e216b7d..63f6175e17c 100644
--- a/contrib/amcheck/verify_gist.c
+++ b/contrib/amcheck/verify_gist.c
@@ -40,15 +40,14 @@ typedef struct GistScanItem
 	BlockNumber blkno;
 
 	/*
-	 * Correctess of this parent tuple will be checked against contents of referenced page.
-	 * This tuple will be NULL for root block.
+	 * Correctess of this parent tuple will be checked against contents of
+	 * referenced page. This tuple will be NULL for root block.
 	 */
 	IndexTuple	parenttup;
 
 	/*
-	 * LSN to hande concurrent scan of the page.
-	 * It's necessary to avoid missing some subtrees from page, that was
-	 * split just before we read it.
+	 * LSN to hande concurrent scan of the page. It's necessary to avoid
+	 * missing some subtrees from page, that was split just before we read it.
 	 */
 	XLogRecPtr	parentlsn;
 
@@ -60,14 +59,14 @@ typedef struct GistScanItem
 
 	/* Pointer to a next stack item. */
 	struct GistScanItem *next;
-} GistScanItem;
+}			GistScanItem;
 
 typedef struct GistCheckState
 {
 	/* Bloom filter fingerprints index tuples */
 	bloom_filter *filter;
 
-	/* Debug counter	FIXME what does 'debug counter' mean?*/
+	/* Debug counter	FIXME what does 'debug counter' mean? */
 	int64		heaptuplespresent;
 
 	/* XXX nitpick: I'd move these 'generic' fields to the beginning */
@@ -84,15 +83,15 @@ typedef struct GistCheckState
 	BlockNumber scannedblocks;
 	BlockNumber deltablocks;
 
-	int leafdepth;
-} GistCheckState;
+	int			leafdepth;
+}			GistCheckState;
 
 PG_FUNCTION_INFO_V1(gist_index_check);
 
 static void gist_init_heapallindexed(Relation rel, GistCheckState * result);
 static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
 											   void *callback_state, bool readonly);
-static void gist_check_page(GistCheckState *check_state,GistScanItem *stack,
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
 							Page page, bool heapallindexed,
 							BufferAccessStrategy strategy);
 static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
@@ -105,7 +104,7 @@ static void gist_tuple_present_callback(Relation index, ItemPointer tid,
 										Datum *values, bool *isnull,
 										bool tupleIsAlive, void *checkstate);
 static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
-			  Datum *attdata, bool *isnull, ItemPointerData tid);
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
 
 /*
  * gist_index_check(index regclass)
@@ -292,6 +291,7 @@ gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
 		if (!GistPageIsLeaf(page))
 		{
 			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
 			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 			{
 				/* Internal page, so recurse to the child */
@@ -327,7 +327,7 @@ gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
 		TableScanDesc scan;
 
 		scan = table_beginscan_strat(heaprel,	/* relation */
-									 check_state->snapshot,	/* snapshot */
+									 check_state->snapshot, /* snapshot */
 									 0, /* number of keys */
 									 NULL,	/* scan key */
 									 true,	/* buffer access strategy OK */
@@ -366,7 +366,7 @@ gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
 }
 
 static void
-gist_check_page(GistCheckState *check_state, GistScanItem *stack,
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
 				Page page, bool heapallindexed, BufferAccessStrategy strategy)
 {
 	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
@@ -379,13 +379,13 @@ gist_check_page(GistCheckState *check_state, GistScanItem *stack,
 		else if (stack->depth != check_state->leafdepth)
 			ereport(ERROR,
 					(errcode(ERRCODE_INDEX_CORRUPTED),
-						errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+					 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
 							RelationGetRelationName(check_state->rel), stack->blkno)));
 	}
 
 	/*
-	 * Check that each tuple looks valid, and is consistent with the
-	 * downlink we followed when we stepped on this page.
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
 	 */
 	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 	{
@@ -393,27 +393,26 @@ gist_check_page(GistCheckState *check_state, GistScanItem *stack,
 		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
 
 		/*
-		 * Check that it's not a leftover invalid tuple from pre-9.1 See
-		 * also gistdoinsert() and gistbulkdelete() handling of such
-		 * tuples. We do consider it error here.
+		 * Check that it's not a leftover invalid tuple from pre-9.1 See also
+		 * gistdoinsert() and gistbulkdelete() handling of such tuples. We do
+		 * consider it error here.
 		 */
 		if (GistTupleIsInvalid(idxtuple))
 			ereport(ERROR,
 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-						errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
 							RelationGetRelationName(check_state->rel), stack->blkno, i),
-						errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
-						errhint("Please REINDEX it.")));
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
 
 		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
 			ereport(ERROR,
 					(errcode(ERRCODE_INDEX_CORRUPTED),
-						errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
 							RelationGetRelationName(check_state->rel), stack->blkno, i)));
 
 		/*
-		 * Check if this tuple is consistent with the downlink in the
-		 * parent.
+		 * Check if this tuple is consistent with the downlink in the parent.
 		 */
 		if (stack->parenttup &&
 			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
@@ -421,26 +420,26 @@ gist_check_page(GistCheckState *check_state, GistScanItem *stack,
 			/*
 			 * There was a discrepancy between parent and child tuples. We
 			 * need to verify it is not a result of concurrent call of
-			 * gistplacetopage(). So, lock parent and try to find downlink
-			 * for current page. It may be missing due to concurrent page
-			 * split, this is OK.
+			 * gistplacetopage(). So, lock parent and try to find downlink for
+			 * current page. It may be missing due to concurrent page split,
+			 * this is OK.
 			 *
-			 * Note that when we aquire parent tuple now we hold lock for
-			 * both parent and child buffers. Thus parent tuple must
-			 * include keyspace of the child.
+			 * Note that when we aquire parent tuple now we hold lock for both
+			 * parent and child buffers. Thus parent tuple must include
+			 * keyspace of the child.
 			 */
 			pfree(stack->parenttup);
 			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
-													stack->blkno, strategy);
+												  stack->blkno, strategy);
 
 			/* We found it - make a final check before failing */
 			if (!stack->parenttup)
 				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
-						stack->blkno, stack->parentblk);
+					 stack->blkno, stack->parentblk);
 			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
-							errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
 								RelationGetRelationName(check_state->rel), stack->blkno, i)));
 			else
 			{
@@ -454,16 +453,17 @@ gist_check_page(GistCheckState *check_state, GistScanItem *stack,
 		{
 			if (heapallindexed)
 				bloom_add_element(check_state->filter,
-									(unsigned char *) idxtuple,
-									IndexTupleSize(idxtuple));
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
 		}
 		else
 		{
 			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
 			if (off != 0xffff)
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
-							errmsg("index \"%s\" has on page %u offset %u has item id not pointing to 0xffff, but %hu",
+						 errmsg("index \"%s\" has on page %u offset %u has item id not pointing to 0xffff, but %hu",
 								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
 		}
 	}
@@ -503,18 +503,18 @@ gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
 							RelationGetRelationName(r))));
 		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
 		{
-			//Datum old = compatt[i];
+			/* Datum old = compatt[i]; */
 			/* Key attributes must never be compressed */
 			if (i < IndexRelationGetNumberOfKeyAttributes(r))
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
-							errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
 								ItemPointerGetBlockNumber(&tid),
 								ItemPointerGetOffsetNumber(&tid),
 								RelationGetRelationName(r))));
 
 			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
-			//pfree(DatumGetPointer(old)); // TODO: this fails. Why?
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
 		}
 	}
 
@@ -620,7 +620,7 @@ gist_refind_parent(Relation rel,
 		 */
 		ereport(ERROR,
 				(errcode(ERRCODE_INDEX_CORRUPTED),
-					errmsg("index \"%s\" internal page %d became leaf",
+				 errmsg("index \"%s\" internal page %d became leaf",
 						RelationGetRelationName(rel), parentblkno)));
 	}
 
@@ -634,8 +634,8 @@ gist_refind_parent(Relation rel,
 		{
 			/*
 			 * Found it! Make copy and return it while both parent and child
-			 * pages are locked. This guaranties that at this particular moment
-			 * tuples must be coherent to each other.
+			 * pages are locked. This guaranties that at this particular
+			 * moment tuples must be coherent to each other.
 			 */
 			result = CopyIndexTuple(itup);
 			break;
-- 
2.45.2

v28-review-0007-Add-gin_index_parent_check-to-verify-GIN-.patchtext/x-patch; charset=UTF-8; name=v28-review-0007-Add-gin_index_parent_check-to-verify-GIN-.patchDownload
From 60aa76aaea5e21a6821aa2be44f11bf35f7ef94b Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v28-review 07/13] Add gin_index_parent_check() to verify GIN
 index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 769 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 7 files changed, 905 insertions(+), 1 deletion(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index f63252ff33c..5c3ea8bc6a4 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	amcheck.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..a2bca7c2037 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_parent_check()
+--
+CREATE FUNCTION gin_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..43fd769a506
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_parent_check('gin_check_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+ gin_index_parent_check 
+------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..9771afffa5c
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_parent_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_parent_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..877ecacb9c1
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,769 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_parent_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_parent_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we didn't missed the downlink of the right sibling
+		 * parent, so that we missed the downlink of the right sibling
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum parent_key = gintuple_get_key(&state,
+												stack->parenttup,
+												&parent_key_category);
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
+											  page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* page split detected, add the right sibling to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum parent_key = gintuple_get_key(&state,
+													stack->parenttup,
+													&parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * re-find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 * missing due to concurrent page split, this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found the parent tuple, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of them.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..e1a471474e6 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_parent_check</function> tests that its target GIN
+      index has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
-- 
2.45.2
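
To exercise gin_index_parent_check() on more than the regression-test tables, a sketch like the following runs it over every GIN index in the current database, largest first (it assumes the patched amcheck is installed; the query stops at the first corrupted index, since the function raises an error):

SELECT c.relname AS index_name,
       gin_index_parent_check(c.oid)
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_am am ON am.oid = c.relam
WHERE am.amname = 'gin'
ORDER BY c.relpages DESC;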

v28-review-0008-review.patchtext/x-patch; charset=UTF-8; name=v28-review-0008-review.patchDownload
From 7ed4f7a82663d576bdb928a2767920fff36d60f9 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:40:22 +0200
Subject: [PATCH v28-review 08/13] review

---
 contrib/amcheck/amcheck--1.4--1.5.sql | 2 +-
 contrib/amcheck/verify_gin.c          | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index a2bca7c2037..7414b611e06 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -14,7 +14,7 @@ LANGUAGE C STRICT;
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
 
 -- gin_index_parent_check()
---
+-- XXX why is this not called simply gin_index_check?
 CREATE FUNCTION gin_index_parent_check(index regclass)
 RETURNS VOID
 AS 'MODULE_PATHNAME', 'gin_index_parent_check'
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 877ecacb9c1..2a63ba6b83e 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -732,6 +732,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 	return result;
 }
 
+/* XXX yet another copy of this? */
 static ItemId
 PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
 					 OffsetNumber offset)
-- 
2.45.2

v28-review-0009-pgindent.patchtext/x-patch; charset=UTF-8; name=v28-review-0009-pgindent.patchDownload
From d280ad00e2834093b838c1cdf0dd8539920b2724 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:47:45 +0200
Subject: [PATCH v28-review 09/13] pgindent

---
 contrib/amcheck/verify_gin.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 2a63ba6b83e..8ce112fe49e 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -38,7 +38,7 @@ typedef struct GinScanItem
 	XLogRecPtr	parentlsn;
 	BlockNumber blkno;
 	struct GinScanItem *next;
-} GinScanItem;
+}			GinScanItem;
 
 /*
  * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
@@ -50,7 +50,7 @@ typedef struct GinPostingTreeScanItem
 	BlockNumber parentblk;
 	BlockNumber blkno;
 	struct GinPostingTreeScanItem *next;
-} GinPostingTreeScanItem;
+}			GinPostingTreeScanItem;
 
 
 PG_FUNCTION_INFO_V1(gin_index_parent_check);
@@ -434,11 +434,11 @@ gin_check_parent_keys_consistency(Relation rel,
 		if (stack->parenttup != NULL)
 		{
 			GinNullCategory parent_key_category;
-			Datum parent_key = gintuple_get_key(&state,
-												stack->parenttup,
-												&parent_key_category);
-			ItemId iid = PageGetItemIdCareful(rel, stack->blkno,
-											  page, maxoff);
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
 			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
@@ -521,9 +521,9 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
-				Datum parent_key = gintuple_get_key(&state,
-													stack->parenttup,
-													&parent_key_category);
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
 
 				if (ginCompareEntries(&state, attnum, current_key,
 									  current_key_category, parent_key,
-- 
2.45.2

v28-review-0010-Add-GiST-support-to-pg_amcheck.patchtext/x-patch; charset=UTF-8; name=v28-review-0010-Add-GiST-support-to-pg_amcheck.patchDownload
From 905964ef70d1955fe7270c6b9d02c73c1c6dbb5f Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v28-review 10/13] Add GiST support to pg_amcheck

Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 268 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 +++++--
 3 files changed, 210 insertions(+), 131 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index a1ad41e7664..8a93fd4140d 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,7 +73,7 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
+	 * tables, or similarly if any such pattern applies to indexes, or
 	 * to schemas, then these will be true, otherwise false.  These should
 	 * always agree with what you'd conclude by grep'ing through the exclude
 	 * list.
@@ -98,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -134,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -151,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -178,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								  PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -195,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -287,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -322,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -381,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -530,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -598,36 +606,33 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension. 
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but
+		 * we also support revision just in case. Now it is expected to be
+		 * zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("--checkunique option is not supported by amcheck "
-							   "version \"%s\"", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("--checkunique option is not supported by amcheck "
+							"version \"%s\"", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -649,8 +654,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -783,13 +788,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -867,7 +888,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -913,6 +934,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to btree equivalent prepares command to check GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+						"SELECT %s.gist_index_check("
+						"index := c.oid, heapallindexed := %s)"
+						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+						"WHERE c.oid = %u "
+						"AND c.oid = i.indexrelid "
+						"AND c.relpersistence != 't' "
+						"AND i.indisready AND i.indisvalid AND i.indislive",
+						rel->datinfo->amcheck_schema,
+						(opts.heapallindexed ? "true" : "false"),
+						rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -952,7 +995,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1102,11 +1145,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command is assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1115,7 +1158,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1126,7 +1169,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
+			 * We expect the checking functions to return one void row
 			 * each, or zero rows if the check was skipped due to the object
 			 * being in the wrong state to be checked, so we should output
 			 * some sort of warning if we get anything more, not because it
@@ -1141,7 +1184,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1155,7 +1198,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1209,6 +1252,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1422,11 +1467,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1461,14 +1506,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1497,17 +1542,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1765,7 +1810,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1807,7 +1852,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1845,8 +1890,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1869,7 +1914,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1896,7 +1941,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1906,7 +1951,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1914,36 +1959,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1977,12 +2022,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2011,7 +2056,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2019,9 +2064,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2034,15 +2079,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2050,7 +2095,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2071,13 +2116,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2091,12 +2136,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2105,29 +2151,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Toast tables for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
+							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
 							 "FROM toast");
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
 							 " UNION"
 		/* Indexes for toast relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+							 "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2147,8 +2193,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2158,15 +2205,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2188,10 +2237,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2201,7 +2251,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 4b16bda6a48..7da498ea98d 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for checkunique option and GiST indexes for not supported versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: --checkunique option is not supported by amcheck version "1.3"/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test --checkunique');
 done_testing();
-- 
2.45.2

v28-review-0011-review.patchtext/x-patch; charset=UTF-8; name=v28-review-0011-review.patchDownload
From 67b59501fe46ecc4899324f6019d0d7a213d57ab Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 15:58:38 +0200
Subject: [PATCH v28-review 11/13] review

---
 src/bin/pg_amcheck/pg_amcheck.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 8a93fd4140d..101cf4f0ae7 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -606,7 +606,7 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. 
+		 * Check the version of amcheck extension.
 		 */
 		amcheck_version = PQgetvalue(result, 0, 1);
 
-- 
2.45.2

v28-review-0012-pgindent.patchtext/x-patch; charset=UTF-8; name=v28-review-0012-pgindent.patchDownload
From 62f23870f276e72ed78e1700f550788bea443832 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 16:00:55 +0200
Subject: [PATCH v28-review 12/13] pgindent

---
 src/bin/pg_amcheck/pg_amcheck.c | 64 ++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 101cf4f0ae7..62f43708da9 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -73,10 +73,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -180,7 +179,7 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
 static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
-								  PGconn *conn);
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
@@ -611,9 +610,8 @@ main(int argc, char *argv[])
 		amcheck_version = PQgetvalue(result, 0, 1);
 
 		/*
-		 * Now amcheck has only major and minor versions in the string but
-		 * we also support revision just in case. Now it is expected to be
-		 * zero.
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
 		 */
 		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
@@ -625,7 +623,7 @@ main(int argc, char *argv[])
 		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
 		{
 			pg_log_warning("--checkunique option is not supported by amcheck "
-							"version \"%s\"", amcheck_version);
+						   "version \"%s\"", amcheck_version);
 			dat->is_checkunique = false;
 		}
 		else
@@ -944,16 +942,16 @@ prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 	resetPQExpBuffer(sql);
 
 	appendPQExpBuffer(sql,
-						"SELECT %s.gist_index_check("
-						"index := c.oid, heapallindexed := %s)"
-						"\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
-						"WHERE c.oid = %u "
-						"AND c.oid = i.indexrelid "
-						"AND c.relpersistence != 't' "
-						"AND i.indisready AND i.indisvalid AND i.indislive",
-						rel->datinfo->amcheck_schema,
-						(opts.heapallindexed ? "true" : "false"),
-						rel->reloid);
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
 }
 
 /*
@@ -1169,12 +1167,12 @@ verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -2155,11 +2153,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
 		appendPQExpBuffer(&sql,
-							 " UNION"
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
-							 "FROM toast");
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
 	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
@@ -2169,11 +2167,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 							 "FROM index");
 	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 		appendPQExpBuffer(&sql,
-							 " UNION"
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
-							 "FROM toast_index", BTREE_AM_OID);
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
-- 
2.45.2

v28-review-0013-assert-in-GIN-debug-message.patchtext/x-patch; charset=UTF-8; name=v28-review-0013-assert-in-GIN-debug-message.patchDownload
From 5fcd217ab30665df8b3c3a914375ff6dee1b7b60 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tv@fuzzy.cz>
Date: Wed, 10 Jul 2024 17:33:28 +0200
Subject: [PATCH v28-review 13/13] assert in GIN debug message

---
 contrib/amcheck/verify_gin.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 8ce112fe49e..c9c20bcb3af 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -293,11 +293,14 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			{
 				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
 
+/*
+ * triggers assert in ItemPointerGetOffsetNumber
+ *
 				elog(DEBUG3, "key (%u, %u) -> %u",
 					 ItemPointerGetBlockNumber(&posting_item->key),
 					 ItemPointerGetOffsetNumber(&posting_item->key),
 					 BlockIdGetBlockNumber(&posting_item->child_blkno));
-
+*/
 				if (i == maxoff &&
 					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
 				{
-- 
2.45.2

#45Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tomas Vondra (#44)
Re: Amcheck verification of GiST and GIN

On 7/10/24 18:01, Tomas Vondra wrote:

...

That's all for now. I'll add this to the stress-testing tests of my
index build patches, and if that triggers more issues I'll report those.

As mentioned a couple days ago, I started using this patch to validate
the patches adding parallel builds to GIN and GiST indexes - I wrote scripts
to stress-test the builds, and I added the new amcheck functions as
another validation step.

For GIN indexes it didn't find anything new (in either this or my
patches), aside from the assert crash I already reported.

But for GiST it turned out to be very valuable - it did actually find an
issue in my patches, or rather confirm my hypothesis that the way the
patch generates fake LSN may not be quite right.

In particular, I've observed these two issues:

ERROR: heap tuple (13315,38) from table "planet_osm_roads" lacks
matching index tuple within index "roads_7_1_idx"

ERROR: index "roads_7_7_idx" has inconsistent records on page 23723
offset 113

And those consistency issues are real - I've managed to reproduce issues
with incorrect query results (by comparing the results to an index built
without parallelism).

So that's nice - it shows the value of this patch, and I like it.

One thing I've been wondering about is that currently amcheck (in
general, not just these new GIN/GiST functions) errors out on the first
issue, because it does ereport(ERROR). Which is good enough to decide if
there is some corruption, but a bit inconvenient if you need to assess
how much corruption is there. For example when investigating the issue
in my patch it would have been great to know if there's just one broken
page, or if there are dozens/hundreds/thousands of them.

I'd imagine we could have a flag which says whether to fail on the first
issue, or keep looking at future pages. Essentially, whether to do
ereport(ERROR) or ereport(WARNING). But maybe that's a dead-end, and
once we find the first issue it's futile to inspect the rest of the
index, because it can be garbage. Not sure. In any case, it's not up to
this patch to invent that.
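
(Just to illustrate what I mean - a rough sketch with made-up names, nothing
like this exists in the patch:

/*
 * Hypothetical: route all corruption reports through one helper and let a
 * flag pick the elevel - ERROR stops at the first problem, WARNING keeps
 * scanning the rest of the index.
 */
#include "postgres.h"

#include "storage/block.h"
#include "storage/off.h"
#include "utils/rel.h"

static int  corruption_elevel = ERROR;  /* WARNING if "keep going" is requested */

static void
report_corruption(Relation rel, BlockNumber blkno, OffsetNumber offset,
                  const char *msg)
{
    ereport(corruption_elevel,
            (errcode(ERRCODE_INDEX_CORRUPTED),
             errmsg("index \"%s\": %s (block %u, offset %u)",
                    RelationGetRelationName(rel), msg, blkno, offset)));
}

The per-page loops would then call report_corruption() instead of doing
ereport(ERROR) directly, and with WARNING the scan would simply move on to
the next page.)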

I don't have additional comments; the patch seems to be clean and likely
ready to go. There are a couple of committers already involved in this
thread - I wonder if one of them has already planned to take care of this?
Peter and Andres, are either of you interested in this?

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#46Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tomas Vondra (#45)
Re: Amcheck verification of GiST and GIN

OK, one more comment - it seems the 0001 patch has incorrect indentation
in bt_index_check_callback, triggering this:

----------------------------------------------------------------------
verify_nbtree.c: In function ‘bt_index_check_callback’:
verify_nbtree.c:331:25: warning: this ‘if’ clause does not guard...
[-Wmisleading-indentation]
331 | if (indrel->rd_opfamily[i] ==
INTERVAL_BTREE_FAM_OID)
| ^~
In file included from ../../src/include/postgres.h:46,
from verify_nbtree.c:24:
../../src/include/utils/elog.h:142:9: note: ...this statement, but the
latter is misleadingly indented as if it were guarded by the ‘if’
142 | do { \
| ^~
../../src/include/utils/elog.h:164:9: note: in expansion of macro
‘ereport_domain’
164 | ereport_domain(elevel, TEXTDOMAIN, __VA_ARGS__)
| ^~~~~~~~~~~~~~
verify_nbtree.c:333:33: note: in expansion of macro ‘ereport’
333 | ereport(ERROR,
| ^~~~~~~
----------------------------------------------------------------------

This seems to be because the ereport() happens to be indented as if it
was in the "if", but should have been at the "for" level.
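
The generic shape gcc is complaining about is something like this (not the
actual code from the patch, just a minimal reproducer of the warning):

/* minimal -Wmisleading-indentation reproducer, unrelated to amcheck */
#include <stdio.h>

int
main(void)
{
    int     nbad = 0;

    for (int i = 0; i < 3; i++)
        if (i == 1)
            nbad++;
            printf("indented like the if-body, but runs once, after the for\n");

    return nbad;
}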

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#47Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Tomas Vondra (#46)
Re: Amcheck verification of GiST and GIN

OK, one more issue report. I originally thought it was a bug in my patch
adding parallel builds for GIN indexes, but it turns out it happens even
with serial builds on master ...

If I build any GIN index, and then do gin_index_parent_check() on it, I
get this error:

create index jsonb_hash on messages using gin (msg_headers jsonb_path_ops);

select gin_index_parent_check('jsonb_hash');
ERROR: index "jsonb_hash" has wrong tuple order, block 43932, offset 328

I did try investigating using pageinspect - the page seems to be the
right-most in the tree, judging by rightlink = InvalidBlockNumber:

test=# select gin_page_opaque_info(get_raw_page('jsonb_hash', 43932));
gin_page_opaque_info
----------------------
(4294967295,0,{})
(1 row)

But gin_leafpage_items() apparently only works with compressed leaf
pages, so I'm not sure what's in the page. In any case, the index seems
to be working fine, so it seems like a bug in this patch.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#48Andrey M. Borodin
x4mmm@yandex-team.ru
In reply to: Tomas Vondra (#44)
Re: Amcheck verification of GiST and GIN

Hi Tomas!

Thank you so much for your interest in the patchset.

On 10 Jul 2024, at 19:01, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

I realized amcheck GIN/GiST support would be useful for testing my
patches adding parallel builds for these index types, so I decided to
take a look at this and do an initial review today.

Great! Thank you!

Attached is a patch series with a few extra commits to keep the review
comments and patches adjusting the formatting by pgindent (the patch
seems far enough along for this).

I was hoping to address your review comments this weekend, but unfortunately I could not. I'll do this ASAP, but at least I decided to post answers to your questions.

Let me quickly go through the review comments:

1) Not sure I like 'amcheck.c' very much, I'd probably go with something
like 'verify_common.c' to match naming of the other files. But it's just
nitpicking and I can live with it.

Any name works for me. We have tens of files ending with "common.c", so I think that's a good way to go.

2) amcheck_lock_relation_and_check seems to be the most important
function, yet there's no comment explaining what it does :-(

Makes sense.

3) amcheck_lock_relation_and_check still has a TODO to add the correct
name of the AM

Yes, I've discovered it during rebase and added TODO.

4) Do we actually need amcheck_index_mainfork_expected as a separate
function, or could it be a part of index_checkable?

It was a separate function before the refactoring...

5) The comment for heaptuplespresent says "debug counter" but that does
not really explain what it's for. (I see verify_nbtree has the same
comment, but maybe let's improve that.)

It's there for a DEBUG1 message:

ereport(DEBUG1,
(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set", ...

But the message is gone for GiST. Perhaps we should restore this message?

6) I'd suggest moving the GISTSTATE + blocknum fields to the beginning
of GistCheckState, it seems more natural to start with "generic" fields.

Makes sense.

7) I'd adjust the gist_check_parent_keys_consistency comment a bit, to
explain what the function does first, and only then explain how.

Makes sense.

8) We seem to be copying PageGetItemIdCareful() around, right? And the
copy in _gist.c still references nbtree - I guess that's not right.

The versions differ in two aspects:
1. The size of the opaque data may be different, but we can pass it as a parameter (see the sketch below).
2. GIN's line pointer verification is slightly more strict.
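
To illustrate aspect 1 - a single shared helper could simply take the opaque
size from the caller. Just a sketch of the idea, untested, not what the
patchset does right now:

/*
 * Sketch only: one shared PageGetItemIdCareful(), with the AM-specific
 * special-space size passed in by the caller.
 */
#include "postgres.h"

#include "storage/bufpage.h"
#include "storage/itemid.h"
#include "utils/rel.h"

static ItemId
PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
                     OffsetNumber offset, Size opaquesize)
{
    ItemId      itemid = PageGetItemId(page, offset);

    if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
        BLCKSZ - MAXALIGN(opaquesize))
        ereport(ERROR,
                (errcode(ERRCODE_INDEX_CORRUPTED),
                 errmsg("line pointer points past end of tuple space in index \"%s\"",
                        RelationGetRelationName(rel)),
                 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
                                    block, offset, ItemIdGetOffset(itemid),
                                    ItemIdGetLength(itemid),
                                    ItemIdGetFlags(itemid))));

    /* the stricter GIN-only line pointer checks would stay in verify_gin.c */
    return itemid;
}

Callers would pass sizeof(BTPageOpaqueData), sizeof(GISTPageOpaqueData) or
sizeof(GinPageOpaqueData), respectively.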

9) Why is the GIN function called gin_index_parent_check() and not
simply gin_index_check() as for the other AMs?

AFAIR a function should be called _parent_ if it takes ShareLock. gin_index_parent_check() does not, so I think we should rename it.

10) The debug in gin_check_posting_tree_parent_keys_consistency triggers
assert when running with client_min_messages='debug5', it seems to be
accessing bogus item pointers.

11) Why does it add pg_amcheck support only for GiST and not GIN?

The GiST part is by far more polished. When we were discussing the current implementation with Peter G, we decided that we could finish the work on GiST first and then proceed to GIN. The main concern is GIN's locking model.

On 12 Jul 2024, at 15:16, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

On 7/10/24 18:01, Tomas Vondra wrote:

...

That's all for now. I'll add this to the stress-testing tests of my
index build patches, and if that triggers more issues I'll report those.

As mentioned a couple days ago, I started using this patch to validate
the patches adding parallel builds to GIN and GiST indexes - I wrote scripts
to stress-test the builds, and I added the new amcheck functions as
another validation step.

For GIN indexes it didn't find anything new (in either this or my
patches), aside from the assert crash I already reported.

But for GiST it turned out to be very valuable - it did actually find an
issue in my patches, or rather confirm my hypothesis that the way the
patch generates fake LSN may not be quite right.

In particular, I've observed these two issues:

ERROR: heap tuple (13315,38) from table "planet_osm_roads" lacks
matching index tuple within index "roads_7_1_idx"

ERROR: index "roads_7_7_idx" has inconsistent records on page 23723
offset 113

And those consistency issues are real - I've managed to reproduce issues
with incorrect query results (by comparing the results to an index built
without parallelism).

So that's nice - it shows the value of this patch, and I like it.

That's great!

One thing I've been wondering about is that currently amcheck (in
general, not just these new GIN/GiST functions) errors out on the first
issue, because it does ereport(ERROR). Which is good enough to decide if
there is some corruption, but a bit inconvenient if you need to assess
how much corruption is there. For example when investigating the issue
in my patch it would have been great to know if there's just one broken
page, or if there are dozens/hundreds/thousands of them.

I'd imagine we could have a flag which says whether to fail on the first
issue, or keep looking at future pages. Essentially, whether to do
ereport(ERROR) or ereport(WARNING). But maybe that's a dead-end, and
once we find the first issue it's futile to inspect the rest of the
index, because it can be garbage. Not sure. In any case, it's not up to
this patch to invent that.

The thing is, amcheck tries hard not to core dump. It's still possible to crash it with garbage, but if we continue checking after encountering the first corruption, an increase in segfaults is inevitable.

Thank you! I hope I can get back to code ASAP.

Best regards, Andrey Borodin.

#49Tomas Vondra
tomas@vondra.me
In reply to: Andrey M. Borodin (#48)
Re: Amcheck verification of GiST and GIN

Hi,

I've spent a bit more time looking at the GiST part as part of my
"parallel GiST build" patch nearby, and I think there's some sort of
memory leak.

Consider this:

create table t (a text);

insert into t select md5(i::text)
from generate_series(1,25000000) s(i);

create index on t using gist (a gist_trgm_ops);

select gist_index_check('t_a_idx', true);

This creates a ~4GB GiST trigram index, and then checks it. But that
gets killed by the OOM killer. On my test machine it consumes
~6.5GB of memory before OOM intervenes.

The memory context stats look like this:

TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408
used: <unnamed>
ExecutorState: 8192 total in 1 blocks; 4024 free (4 chunks); 4168 used
printtup: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
ExprContext: 8192 total in 1 blocks; 7224 free (10 chunks); 968 used
amcheck context: 3128950872 total in 376 blocks; 219392 free
(1044 chunks); 3128731480 used
ExecutorState: 8192 total in 1 blocks; 7200 free (0 chunks);
992 used
ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
GiST scan context: 22248 total in 2 blocks; 7808 free (8
chunks); 14440 used

This is from before the OOM kill, but it shows there's ~3GB of memory in
the amcheck context.

Seems like a memory leak to me - I didn't look at which place leaks.

regards

--
Tomas Vondra

#50Kirill Reshke
reshkekirill@gmail.com
In reply to: Tomas Vondra (#49)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

Hi!

On Mon, 5 Aug 2024 at 20:05, Tomas Vondra <tomas@vondra.me> wrote:

Hi,

I've spent a bit more time looking at the GiST part as part of my
"parallel GiST build" patch nearby, and I think there's some sort of
memory leak.

Consider this:

create table t (a text);

insert into t select md5(i::text)
from generate_series(1,25000000) s(i);

create index on t using gist (a gist_trgm_ops);

select gist_index_check('t_a_idx', true);

This creates a ~4GB GiST trigram index, and then checks it. But that
gets killed by the OOM killer. On my test machine it consumes
~6.5GB of memory before OOM intervenes.

The memory context stats look like this:

TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
PortalContext: 1024 total in 1 blocks; 616 free (0 chunks); 408
used: <unnamed>
ExecutorState: 8192 total in 1 blocks; 4024 free (4 chunks); 4168 used
printtup: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
ExprContext: 8192 total in 1 blocks; 7224 free (10 chunks); 968 used
amcheck context: 3128950872 total in 376 blocks; 219392 free
(1044 chunks); 3128731480 used
ExecutorState: 8192 total in 1 blocks; 7200 free (0 chunks);
992 used
ExprContext: 8192 total in 1 blocks; 7952 free (0 chunks);
240 used
GiST scan context: 22248 total in 2 blocks; 7808 free (8
chunks); 14440 used

This is from before the OOM kill, but it shows there's ~3GB of memory in
the amcheck context.

Seems like a memory leak to me - I didn't look at which place leaks.

+ 1, there is a memory leak.

regards

--
Tomas Vondra

So, I did some testing, and it seems that the tuple returned by
`gistgetadjusted` inside `gist_check_page` is not being freed.

Trivial fix attached.

--
Best regards,
Kirill Reshke

Attachments:

v28-0014-Fix-memory-leak-in-gist_check_page.patchapplication/octet-stream; name=v28-0014-Fix-memory-leak-in-gist_check_page.patchDownload
From 31fbef4e654946f3513b81ddf437cabf41ae3e74 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Fri, 18 Oct 2024 12:39:48 +0000
Subject: [PATCH v28] Fix memory leak in `gist_check_page`

---
 contrib/amcheck/verify_gist.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
index 63f6175e17c..3241f84cba7 100644
--- a/contrib/amcheck/verify_gist.c
+++ b/contrib/amcheck/verify_gist.c
@@ -391,6 +391,7 @@ gist_check_page(GistCheckState * check_state, GistScanItem * stack,
 	{
 		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
 		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
 
 		/*
 		 * Check that it's not a leftover invalid tuple from pre-9.1 See also
@@ -414,8 +415,10 @@ gist_check_page(GistCheckState * check_state, GistScanItem * stack,
 		/*
 		 * Check if this tuple is consistent with the downlink in the parent.
 		 */
-		if (stack->parenttup &&
-			gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
 		{
 			/*
 			 * There was a discrepancy between parent and child tuples. We
@@ -428,6 +431,8 @@ gist_check_page(GistCheckState * check_state, GistScanItem * stack,
 			 * parent and child buffers. Thus parent tuple must include
 			 * keyspace of the child.
 			 */
+
+			pfree(tmpTuple);
 			pfree(stack->parenttup);
 			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
 												  stack->blkno, strategy);
-- 
2.34.1

#51Kirill Reshke
reshkekirill@gmail.com
In reply to: Andrey M. Borodin (#48)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Mon, 15 Jul 2024 at 00:00, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

Hi Tomas!

Thank you so much for your interest in the patchset.

On 10 Jul 2024, at 19:01, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

I realized amcheck GIN/GiST support would be useful for testing my
patches adding parallel builds for these index types, so I decided to
take a look at this and do an initial review today.

Great! Thank you!

Attached is a patch series with extra commits to keep the review
comments, plus patches adjusting the formatting by pgindent (the patch
seems far enough along for this).

I was hoping to address your review comments this weekend, but unfortunately I could not. I'll do this ASAP, but at least I decided to post answers to your questions.

Let me quickly go through the review comments:

1) Not sure I like 'amcheck.c' very much, I'd probably go with something
like 'verify_common.c' to match naming of the other files. But it's just
nitpicking and I can live with it.

Any name works for me. We have tens of files ending with "common.c", so I think that's a good way to go.

2) amcheck_lock_relation_and_check seems to be the most important
function, yet there's no comment explaining what it does :-(

Makes sense.

3) amcheck_lock_relation_and_check still has a TODO to add the correct
name of the AM

Yes, I discovered it during the rebase and added a TODO.

4) Do we actually need amcheck_index_mainfork_expected as a separate
function, or could it be a part of index_checkable?

It was a separate function before refactoring...

5) The comment for heaptuplespresent says "debug counter" but that does
not really explain what it's for. (I see verify_nbtree has the same
comment, but maybe let's improve that.)

It's there for a DEBUG1 message:
ereport(DEBUG1,
(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
But the message is gone for GiST. Perhaps we should restore it?
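
For reference, the full message in verify_nbtree.c reads roughly as below; a GiST version would need equivalent fields (the names here follow the nbtree check state, the GiST state struct in the patch may differ):

```c
ereport(DEBUG1,
		(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
						 state->heaptuplespresent,
						 RelationGetRelationName(state->heaprel),
						 100.0 * bloom_prop_bits_set(state->filter))));
```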

6) I'd suggest moving the GISTSTATE + blocknum fields to the beginning
of GistCheckState, it seems more natural to start with "generic" fields.

Makes sense.

7) I'd adjust the gist_check_parent_keys_consistency comment a bit, to
explain what the function does first, and only then explain how.

Makes sense.

8) We seem to be copying PageGetItemIdCareful() around, right? And the
copy in _gist.c still references nbtree - I guess that's not right.

The versions differ in two aspects:
1. The size of the opaque data may be different, but we can pass it as a parameter.
2. GIN's line pointer verification is slightly more strict.
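
For illustration, a shared helper could take the special-area size as a parameter; a rough sketch (hypothetical signature, keeping only the bound check that the copies share):

```c
/*
 * Hypothetical shared helper: the caller passes its AM's special-area
 * (opaque) size, so nbtree/GiST/GIN can share the common bound check and
 * keep any stricter, AM-specific line pointer checks in their own files.
 */
static ItemId
PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
					 OffsetNumber offset, Size opaquesize)
{
	ItemId		itemid = PageGetItemId(page, offset);

	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
		BLCKSZ - MAXALIGN(opaquesize))
		ereport(ERROR,
				(errcode(ERRCODE_INDEX_CORRUPTED),
				 errmsg("line pointer points past end of tuple space in index \"%s\"",
						RelationGetRelationName(rel)),
				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
									block, offset, ItemIdGetOffset(itemid),
									ItemIdGetLength(itemid),
									ItemIdGetFlags(itemid))));

	return itemid;
}
```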

9) Why is the GIN function called gin_index_parent_check() and not
simply gin_index_check() as for the other AMs?

AFAIR, a function should be called _parent_ if it takes ShareLock. gin_index_parent_check() does not, so I think we should rename it.

10) The debug output in gin_check_posting_tree_parent_keys_consistency triggers
an assertion failure when running with client_min_messages='debug5'; it seems to
be accessing bogus item pointers.

11) Why does it add pg_amcheck support only for GiST and not GIN?

The GiST part is by far the more polished. When we discussed the current implementation with Peter G, we decided to finish work on GiST first and then proceed to GIN. The main concern is GIN's locking model.

On 12 Jul 2024, at 15:16, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

On 7/10/24 18:01, Tomas Vondra wrote:

...

That's all for now. I'll add this to the stress-testing tests of my
index build patches, and if that triggers more issues I'll report those.

As mentioned a couple of days ago, I started using this patch to validate
the patches adding parallel builds to GIN and GiST indexes - I have scripts
to stress-test the builds, and I added the new amcheck functions as
another validation step.

For GIN indexes it didn't find anything new (in either this or my
patches), aside from the assert crash I already reported.

But for GiST it turned out to be very valuable - it did actually find an
issue in my patches, or rather confirmed my hypothesis that the way the
patch generates fake LSNs may not be quite right.

In particular, I've observed these two issues:

ERROR: heap tuple (13315,38) from table "planet_osm_roads" lacks
matching index tuple within index "roads_7_1_idx"

ERROR: index "roads_7_7_idx" has inconsistent records on page 23723
offset 113

And those consistency issues are real - I've managed to reproduce issues
with incorrect query results (by comparing the results to an index built
without parallelism).

So that's nice - it shows the value of this patch, and I like it.

That's great!

One thing I've been wondering about is that currently amcheck (in
general, not just these new GIN/GiST functions) errors out on the first
issue, because it does ereport(ERROR). Which is good enough to decide if
there is some corruption, but a bit inconvenient if you need to assess
how much corruption there is. For example, when investigating the issue
in my patch it would have been great to know if there's just one broken
page, or if there are dozens/hundreds/thousands of them.

I'd imagine we could have a flag which says whether to fail on the first
issue, or keep looking at future pages. Essentially, whether to do
ereport(ERROR) or ereport(WARNING). But maybe that's a dead-end, and
once we find the first issue it's futile to inspect the rest of the
index, because it can be garbage. Not sure. In any case, it's not up to
this patch to invent that.
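
Concretely, the flag could be as simple as routing all reports through one helper; a minimal sketch (hypothetical names, not part of the posted patches):

```c
/*
 * Hypothetical sketch: funnel corruption reports through one helper so a
 * caller-supplied flag can downgrade them from ERROR to WARNING and let
 * the scan continue past the first finding.
 */
static void
report_index_corruption(Relation rel, BlockNumber blkno, bool stop_on_first,
						const char *detail)
{
	ereport(stop_on_first ? ERROR : WARNING,
			(errcode(ERRCODE_INDEX_CORRUPTED),
			 errmsg("index \"%s\" is corrupted at block %u: %s",
					RelationGetRelationName(rel), blkno, detail)));
}
```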

The thing is, amcheck tries hard not to dump core. It's still possible to crash it with garbage. But if we continue checking after encountering the first corruption, an increase in segfaults is inevitable.

Thank you! I hope I can get back to code ASAP.

Best regards, Andrey Borodin.

Hi!
I did a mechanical patch rebase & beautification.

Note the first patch: I did a small refactoring as a separate contribution.

=== fixups from Tomas's review

1) 0001-Refactor-amcheck-to-extract-common..

This change was not correct (the if statement now needs braces):

- if (allequalimage && !_bt_allequalimage(indrel, false))
- {
- bool has_interval_ops = false;
-
- for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
- if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
- has_interval_ops = true;
- ereport(ERROR,
+ for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+ if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+ has_interval_ops = true;
+ ereport(ERROR,

Applied all of Tomas's review comments. The index relation AM mismatch
error message now looks like this:
```
db1=# select bt_index_check('users_search_idx');
ERROR: expected "btree" index as targets for verification
DETAIL: Relation "users_search_idx" is a gin index.
```
I added Tomas to the Reviewed-by section of this patch (in the commit message).
I also changed the commit message for this patch.

2) 0002-Add-gist_index_check-function-to-verify-G.patch
I applied Tomas's review comments. I left GiST's version of
PageGetItemIdCareful unchanged. Maybe we should have a common check in
verify_common.c, as Tomas was arguing for, but I'm not doing anything
for now because I don't really understand its purpose. All other
review comments are addressed (I hope), if I'm not missing anything.

I also included my fix for the memory leak mentioned by Tomas.

3) 003-Add gin_index_check() to verify GIN index
The only change is gin_index_parent_check() -> gin_index_check()

4) Applying: Add GiST support to pg_amcheck

Simply rebased & ran pgindent.

==== tests

make check runs successfully

=== problems with gin_index_check

1)
```
reshke@ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
psql (18devel)
Type "help" for help.

db1=# select gin_index_check('users_search_idx');
ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
```

For some reason gin_index_check fails on my index. I am 99% sure there is
no corruption in it. I will try to investigate.

2) This was already discovered by Tomas, but I'll add my input here:

psql session:
```
db1=# set log_min_messages to debug5;
SET
db1=# select gin_index_check('users_search_idx');

```

gdb session:
```
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140601454760896) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140601454760896) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140601454760896,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007fe055af0476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055ea82af4ef0 in ExceptionalCondition
(conditionName=conditionName@entry=0x7fe04a87aa35
"ItemPointerIsValid(pointer)",
fileName=fileName@entry=0x7fe04a87a928
"../../src/include/storage/itemptr.h",
lineNumber=lineNumber@entry=126) at assert.c:66
#6 0x00007fe04a871372 in ItemPointerGetOffsetNumber
(pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
#7 ItemPointerGetOffsetNumber (pointer=<optimized out>) at
../../src/include/storage/itemptr.h:124
#8 gin_check_posting_tree_parent_keys_consistency
(posting_tree_root=<optimized out>, rel=<optimized out>) at
verify_gin.c:296
#9 gin_check_parent_keys_consistency (rel=rel@entry=0x7fe04a8aa328,
heaprel=heaprel@entry=0x7fe04a8a9db8,
callback_state=callback_state@entry=0x0,
readonly=readonly@entry=false) at verify_gin.c:597
#10 0x00007fe04a87098d in amcheck_lock_relation_and_check
(indrelid=16488, am_id=am_id@entry=2742,
check=check@entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
lockmode=lockmode@entry=1,
state=state@entry=0x0) at verify_common.c:132
#11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
verify_gin.c:81
#12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
econtext=0x55ea84903138, isnull=<optimized out>) at
execExprInterp.c:770
#13 0x000055ea82804fdc in ExecEvalExprSwitchContext
(isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
at ../../../src/include/executor/executor.h:367
#14 ExecProject (projInfo=0x55ea84903388) at
../../../src/include/executor/executor.h:401
#15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
#16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
../../../src/include/executor/executor.h:278
#17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
#18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
execMain.c:362
#19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
forward=<optimized out>, count=0, dest=<optimized out>) at
pquery.c:924
#20 0x000055ea829aedc1 in PortalRun
(portal=portal@entry=0x55ea848b1810,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
dest=dest@entry=0x55ea84901940,
altdest=altdest@entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
pquery.c:768
#21 0x000055ea829aab47 in exec_simple_query
(query_string=0x55ea84831250 "select
gin_index_check('users_search_idx');") at postgres.c:1283
#22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4798
#23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:107
#24 0x000055ea8290122f in postmaster_child_launch
(child_type=<optimized out>, child_slot=1,
startup_data=startup_data@entry=0x7ffeba7fe48c "",
startup_data_len=startup_data_len@entry=4,
client_sock=client_sock@entry=0x7ffeba7fe490) at launch_backend.c:274
#25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
at postmaster.c:3377
#26 ServerLoop () at postmaster.c:1663
#27 0x000055ea8290656b in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55ea8482ab10) at postmaster.c:1361
#28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
(gdb)
```

We also need to change the default version of the extension to 1.5.
I'm not sure which patch of this series should do that.

====
Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
patches need more review rounds.

--
Best regards,
Kirill Reshke

Attachments:

v29-0004-Add-gin_index_check-to-verify-GIN-index.patchapplication/octet-stream; name=v29-0004-Add-gin_index_check-to-verify-GIN-index.patchDownload
From b099ffb366e0f4e08a08a5cdefe971203f7232de Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v29 4/5] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 769 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 906 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..47b6e81fbc4
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,769 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlinks, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+		{
+			ipd = palloc(0);
+		}
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates memory context and scans through postigTree graph
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			}
+			else
+			{
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+			}
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+			{
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+			}
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u)",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we didn't missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify it is not a result of
+					 * concurrent call of gistplacetopage(). So, lock parent
+					 * and try to find downlink for current page. It may be
+					 * missing due to concurrent page split, this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* We found it - make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+						else
+						{
+							/*
+							 * But now it is properly adjusted - nothing to do
+							 * here.
+							 */
+						}
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+				{
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				}
+				else
+				{
+					ptr->parenttup = NULL;
+				}
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuples relations (no parent tuples
+      require tuple adjustement) and page graph respects balanced-tree
+      invariants (internal pages reference only leaf page or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1

v29-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v29-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchDownload
From 77b128cb0b0c10cf71d3282cd1d55c473b691a7d Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v29 1/5] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 8b82797c10f..746f7ce09fb 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

v29-0005-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v29-0005-Add-GiST-support-to-pg_amcheck.patchDownload
From db939f443a47120c7c1adc8d7b47c5f0025b2224 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v29 5/5] Add GiST support to pg_amcheck

Proof of concept patch for pg_amcheck binary support
for GIST & GIN index checks.

Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 292 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 221 insertions(+), 144 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 0c05cf58bce..cf01a370b1b 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -39,8 +39,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -74,10 +73,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -98,14 +96,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -134,7 +132,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -151,13 +149,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -178,10 +178,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -195,7 +197,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -287,6 +289,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -322,11 +325,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -381,7 +384,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -530,6 +533,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -598,36 +605,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -649,8 +652,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -783,13 +786,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u is not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -867,7 +886,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -913,6 +932,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to the btree equivalent; prepares a command to check a GiST index.
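+ *
+ * For illustration, with amcheck installed in schema "public" and an index
+ * OID of 16384 (both values hypothetical), the generated command looks
+ * roughly like:
+ *
+ *   SELECT public.gist_index_check(index := c.oid, heapallindexed := false)
+ *   FROM pg_catalog.pg_class c, pg_catalog.pg_index i
+ *   WHERE c.oid = 16384 AND c.oid = i.indexrelid
+ *   AND c.relpersistence != 't'
+ *   AND i.indisready AND i.indisvalid AND i.indislive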
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -952,7 +993,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1102,11 +1143,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1115,7 +1156,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1126,12 +1167,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1141,7 +1182,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1155,7 +1196,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1209,6 +1250,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1422,11 +1465,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1461,14 +1504,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1497,17 +1540,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1765,7 +1808,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1807,7 +1850,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1845,8 +1888,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1869,7 +1912,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1896,7 +1939,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1906,7 +1949,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1914,36 +1957,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1977,12 +2020,12 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ('r', 'S', 'm', 't', 'i') "
 						  "AND ((c.relam = %u AND c.relkind IN ('r', 'S', 'm', 't')) OR "
-						  "(c.relam = %u AND c.relkind = 'i'))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = 'i'))",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2011,7 +2054,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2019,9 +2062,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2034,15 +2077,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = 'i'",
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2050,7 +2093,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2071,13 +2114,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u"
+						  " AND c.relam = %u "
 						  " AND c.relkind = 'i')",
 						  BTREE_AM_OID);
 	}
@@ -2091,12 +2134,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2105,29 +2149,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2147,8 +2191,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2158,15 +2203,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2188,10 +2235,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2201,7 +2249,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST indexes in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for the checkunique option and GiST index checks with unsupported amcheck versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test for GiST check with old amcheck version');
 done_testing();
-- 
2.34.1

v29-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v29-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 94f71608c8ab98ee6218cc497af55c28bc7b9abf Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v29 3/5] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are included in the parent tuple's keyspace.
The traversal takes a lock on only one page at a time, until some
discrepancy is found.  To re-check a suspicious pair of parent and
child tuples it acquires locks on both parent and child pages in the
same order as a page split does.
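
As an illustration (the index name here is just a placeholder), the new
check is invoked like the existing btree checks; the second form also
verifies that all heap tuples are present in the index:

    SELECT gist_index_check('some_gist_idx', false);
    SELECT gist_index_check('some_gist_idx', true);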

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages always include the tuples
+ * on child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN of the parent page, used to detect concurrent page splits. It's
+	 * necessary to avoid missing subtrees of a page that was split just
+	 * before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter of heap tuples verified to be present in the index */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Initialize the fields of GistCheckState needed for heapallindexed
+ * verification: the Bloom filter and the MVCC snapshot.
+ */
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index.  This logic is similar to B-tree; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that the tuples of internal pages cover all
+ * the key space of each tuple on a leaf page.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * The check allocates a memory context and scans through the GiST
+ * graph.  The scan is a depth-first search using a stack of
+ * GistScanItems.  Initially the stack contains only the root block
+ * number.  On each iteration the top block number is replaced by the
+ * block numbers it references.
+ *
+ * gist_check_page() in its turn checks every tuple on a page for
+ * consistency with the downlink tuple we followed from the parent:
+ * the parent GiST tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan, as it was before
+	 * commit fe280694d introduced physical-order scanning.
+	 */
+
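+	/*
+	 * Walk the tree depth-first: each iteration pops one GistScanItem off
+	 * the stack, and for internal pages pushes one item per downlink.
+	 */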
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * It's possible that the page was split after we looked at the
+		 * parent, in which case we would have missed the downlink of the
+		 * right sibling when we scanned the parent.  If so, add the right
+		 * sibling to the stack now.
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, push the right sibling onto the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to the next item on the stack */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
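+	/*
+	 * Second phase of heapallindexed verification: scan the heap and check
+	 * that every heap tuple has a matching fingerprint in the Bloom filter
+	 * built from the index's leaf tuples above.
+	 */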
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
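+/*
+ * gist_check_page - check one page of the index.
+ *
+ * Verifies that the tree has the same depth to every leaf page, that each
+ * tuple on the page looks valid and is consistent with the parent downlink
+ * in 'stack', and, when heapallindexed verification is requested,
+ * fingerprints leaf tuples into the Bloom filter.
+ */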
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal page traversal encountered a leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1 See also
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such tuples.
+		 * We consider it an error here.
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between the parent and child tuples.
+			 * We need to verify that it is not a result of a concurrent call
+			 * of gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks
+			 * on both the parent and child buffers. Thus the parent tuple
+			 * must include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* Report if the downlink vanished, otherwise make a final check */
+			if (!stack->parenttup)
+				elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention in the system catalogs. Detoasting of key attributes is
+ * expected to be done by opclass decompression methods, if the indexed
+ * type might be toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

v29-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v29-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From c945a914331f4d558eae15ecc7d17ba2c847d2eb Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v29 2/5] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other indexes must take the same safety measures:
 - making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This patch relocates the existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..acdcf5729f7
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+ * Amcheck main workhorse.
+ * Given an index relation OID, lock the relation and then take a number of
+ * standard actions:
+ * 1) make sure the index can be checked,
+ * 2) switch to the table owner's security context,
+ * 3) keep track of GUCs modified via index functions,
+ * 4) execute the callback function to verify integrity.
+ */
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 746f7ce09fb..e338891671a 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1

#52Andrey M. Borodin
x4mmm@yandex-team.ru
In reply to: Kirill Reshke (#51)
Re: Amcheck verification of GiST and GIN

On 26 Nov 2024, at 11:50, Kirill Reshke <reshkekirill@gmail.com> wrote:

I did mechanical patch rebase & beautification.

Many thanks! Addressing Tomas' feedback was still one of the top items on my todo list. And I'm more than happy that someone is advancing this patchset.

Note my first patch: I did a small refactoring as a separate contribution.

It looks like these days such patches are committed via separate threads. Change looks good to me.

=== review from Tomas fixups

1) 0001-Refactor-amcheck-to-extract-common..

This change was not correct (the if statement now needs parentheses):

- if (allequalimage && !_bt_allequalimage(indrel, false))
- {
- bool has_interval_ops = false;
-
- for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
- if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
- has_interval_ops = true;
- ereport(ERROR,
+ for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+ if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+ has_interval_ops = true;
+ ereport(ERROR,

Applied all of Tomas' review comments.

+1. All changes were correct, but there were some questions from Tomas.

The index relation AM mismatch
error message now looks like this:
```
db1=# select bt_index_check('users_search_idx');
ERROR: expected "btree" index as targets for verification
DETAIL: Relation "users_search_idx" is a gin index.
```

Great!

I added Tomas to the Reviewed-by section of this patch (in the commit message).
I also changed the commit message for this patch.

2) 0002-Add-gist_index_check-function-to-verify-G.patch
I applied Tomas' review comments. I left GiST's version of
PageGetItemIdCareful unchanged. Maybe we should have a common check in
verify_common.c, as Tomas was arguing for, but I'm not doing anything
for now, because I don't really understand its purpose. All other
review comments are addressed (I hope), if I'm not missing anything.

I also included my fix for the memory leak mentioned by Tomas.

The fix looks correct to me.

=== problems with gin_index_check

1)
```
reshke@ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
psql (18devel)
Type "help" for help.

db1=# select gin_index_check('users_search_idx');
ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
```

Ughm... are you sure it's not induced by some collation issue? Did you create the index on the same VM where the test was performed?
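
For what it's worth, a quick way to rule that out is to look for collations whose recorded version no longer matches what the provider currently reports. A minimal sketch (it assumes nothing beyond the standard pg_collation catalog and pg_collation_actual_version()):

```
-- Collations whose stored version differs from the provider's current one;
-- a mismatch here is a common source of spurious ordering complaints after
-- an OS or ICU upgrade.
SELECT collname, collprovider, collversion,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collversion IS NOT NULL
  AND collversion <> pg_collation_actual_version(oid);
```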

For some reason gin_index_check fails on my index. I am 99% sure there is
no corruption in it. Will try to investigate.

2) this is already discovered by Tomas, but I add my input here:

psql session:
```
db1=# set log_min_messages to debug5;
SET
db1=# select gin_index_check('users_search_idx');

```

gdb session:
```
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140601454760896) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140601454760896) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140601454760896,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007fe055af0476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055ea82af4ef0 in ExceptionalCondition
(conditionName=conditionName@entry=0x7fe04a87aa35
"ItemPointerIsValid(pointer)",
fileName=fileName@entry=0x7fe04a87a928
"../../src/include/storage/itemptr.h",
lineNumber=lineNumber@entry=126) at assert.c:66
#6 0x00007fe04a871372 in ItemPointerGetOffsetNumber
(pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
#7 ItemPointerGetOffsetNumber (pointer=<optimized out>) at
../../src/include/storage/itemptr.h:124
#8 gin_check_posting_tree_parent_keys_consistency
(posting_tree_root=<optimized out>, rel=<optimized out>) at
verify_gin.c:296
#9 gin_check_parent_keys_consistency (rel=rel@entry=0x7fe04a8aa328,
heaprel=heaprel@entry=0x7fe04a8a9db8,
callback_state=callback_state@entry=0x0,
readonly=readonly@entry=false) at verify_gin.c:597
#10 0x00007fe04a87098d in amcheck_lock_relation_and_check
(indrelid=16488, am_id=am_id@entry=2742,
check=check@entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
lockmode=lockmode@entry=1,
state=state@entry=0x0) at verify_common.c:132
#11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
verify_gin.c:81
#12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
econtext=0x55ea84903138, isnull=<optimized out>) at
execExprInterp.c:770
#13 0x000055ea82804fdc in ExecEvalExprSwitchContext
(isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
at ../../../src/include/executor/executor.h:367
#14 ExecProject (projInfo=0x55ea84903388) at
../../../src/include/executor/executor.h:401
#15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
#16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
../../../src/include/executor/executor.h:278
#17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
#18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
execMain.c:362
#19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
forward=<optimized out>, count=0, dest=<optimized out>) at
pquery.c:924
#20 0x000055ea829aedc1 in PortalRun
(portal=portal@entry=0x55ea848b1810,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
dest=dest@entry=0x55ea84901940,
altdest=altdest@entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
pquery.c:768
#21 0x000055ea829aab47 in exec_simple_query
(query_string=0x55ea84831250 "select
gin_index_check('users_search_idx');") at postgres.c:1283
#22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4798
#23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:107
#24 0x000055ea8290122f in postmaster_child_launch
(child_type=<optimized out>, child_slot=1,
startup_data=startup_data@entry=0x7ffeba7fe48c "",
startup_data_len=startup_data_len@entry=4,
client_sock=client_sock@entry=0x7ffeba7fe490) at launch_backend.c:274
#25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
at postmaster.c:3377
#26 ServerLoop () at postmaster.c:1663
#27 0x000055ea8290656b in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55ea8482ab10) at postmaster.c:1361
#28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
(gdb)
```

We also need to change the default version of the extension to 1.5.
I'm not sure which patch of this series should do that.

+1
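
For reference, the bump itself should be small: raise default_version in amcheck.control and add an upgrade script. A sketch, with signatures taken from the patches in this thread and flags copied from the existing amcheck scripts (so treat the details as an assumption):

```
-- amcheck--1.4--1.5.sql (sketch)
CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gist_index_check'
LANGUAGE C STRICT PARALLEL RESTRICTED;

CREATE FUNCTION gin_index_check(index regclass)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gin_index_check'
LANGUAGE C STRICT PARALLEL RESTRICTED;

-- and in amcheck.control: default_version = '1.5'
```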

====
Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
patches need more review rounds.

Yeah, I agree with this analysis.

Best regards, Andrey Borodin.

#53Kirill Reshke
reshkekirill@gmail.com
In reply to: Andrey M. Borodin (#52)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

On Tue, 26 Nov 2024 at 12:22, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:

=== problems with gin_index_check

1)
```
reshke@ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
psql (18devel)
Type "help" for help.

db1=# select gin_index_check('users_search_idx');
ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
```

Ughm... are you sure it's not induced by some collation issue? Did you create the index on the same VM where the test was performed?

I have some input on this. I generated a big table with 2 md5 fields
and used binary search to find the exact border where gin_index_check
stops working.

```

db1=# create table users_3084 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3084 select * from users order by first_name limit 3084;
INSERT 0 3084
db1=# select gin_index_check('users_3084_first_name_last_name_idx');
gin_index_check
-----------------

(1 row)
.....
db1=# create table users_3085 (like users INCLUDING INDEXES);
CREATE TABLE
db1=# insert into users_3085 select * from users order by first_name limit 3085;
INSERT 0 3085
db1=# select gin_index_check('users_3085_first_name_last_name_idx');
ERROR: index "users_3085_first_name_last_name_idx" has wrong tuple
order, block 589, offset 45
```

DDL:

CREATE TABLE public.users_3085 (
first_name text,
last_name text
);
CREATE INDEX users_3085_first_name_idx ON public.users_3085 USING hash
(first_name);
CREATE INDEX users_3085_first_name_last_name_idx ON public.users_3085
USING gin (first_name public.gin_trgm_ops, last_name
public.gin_trgm_ops);

PFA contents of users_3085
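
As a crude cross-check of whether the index really is corrupted, one can compare query results with and without it. A sketch (the '%abc%' pattern is made up; any trigram-indexable predicate will do):

```
SET enable_bitmapscan = off;
SET enable_indexscan = off;
SELECT count(*) FROM users_3085 WHERE first_name LIKE '%abc%';  -- forced seq scan
RESET enable_bitmapscan;
RESET enable_indexscan;
SELECT count(*) FROM users_3085 WHERE first_name LIKE '%abc%';  -- may use the GIN index
```

Matching counts would support the suspicion that the "wrong tuple order" report is a false positive rather than real corruption.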


====
Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
patches need more review rounds.

Yeah, I agree with this analysis.

I now don't: I'm leaning towards squashing 0001-0002 into one patch.

Best regards, Andrey Borodin.

--
Best regards,
Kirill Reshke

Attachments:

users_3085application/octet-stream; name=users_3085Download
#54Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#51)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Tue, 26 Nov 2024 at 11:50, Kirill Reshke <reshkekirill@gmail.com> wrote:

====
Overall I think 0001 & 0002 are ready as-is. 0003 is maybe ok. Other
patches need more review rounds.

--
Best regards,
Kirill Reshke

=== Patch changes.

Polishing:
1)

+ /* last tuple in layer has no high key */
+ if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+ {
+ ptr->parenttup = CopyIndexTuple(idxtuple);
+ }
+ else
+ {
+ ptr->parenttup = NULL;
+ }

This coding does not align with PostgreSQL style. I removed the
parentheses here and in a few other places:

```
reshke@ygp-jammy:~/postgres$ git diff contrib/amcheck/verify_gin.c
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 47b6e81fbc4..be44cc724f8 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -111,9 +111,7 @@ ginReadTupleWithoutState(IndexTuple itup, int *nitems)
                                         nipd, ndecoded);
                }
                else
-               {
                        ipd = palloc(0);
-               }
        }
        else
        {
@@ -194,7 +192,6 @@
gin_check_posting_tree_parent_keys_consistency(Relation rel,
BlockNumber posting
                        list = GinDataLeafPageGetItems(page, &nlist, minItem);

if (nlist > 0)
- {
snprintf(tidrange_buf, sizeof(tidrange_buf),
"%d tids (%u, %u) - (%u, %u)",
nlist,
@@ -202,11 +199,8 @@
gin_check_posting_tree_parent_keys_consistency(Relation rel,
BlockNumber posting

ItemPointerGetOffsetNumberNoCheck(&list[0]),

ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),

ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
- }
else
- {
snprintf(tidrange_buf,
sizeof(tidrange_buf), "0 tids");
- }

if (stack->parentblk != InvalidBlockNumber)
{
@@ -218,11 +212,9 @@
gin_check_posting_tree_parent_keys_consistency(Relation rel,
BlockNumber posting
tidrange_buf);
}
else
- {
elog(DEBUG3, "blk %u: root leaf, %s",
stack->blkno,
tidrange_buf);
- }

if (stack->parentblk != InvalidBlockNumber &&

ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) !=
InvalidOffsetNumber &&
@@ -576,13 +568,9 @@ gin_check_parent_keys_consistency(Relation rel,
ptr->depth = stack->depth + 1;
/* last tuple in layer has no high key */
if (i != maxoff &&
!GinPageGetOpaque(page)->rightlink)
- {
ptr->parenttup =
CopyIndexTuple(idxtuple);
- }
else
- {
ptr->parenttup = NULL;
- }
ptr->parentblk = stack->blkno;
ptr->blkno = GinGetDownlink(idxtuple);
ptr->parentlsn = lsn;

```

2)

+                                               else
+                                               {
+                                                       /*
+                                                        * But now it is properly adjusted - nothing to do
+                                                        * here.
+                                                        */
+                                               }

if (...) ... else {/* comment */} is a strange pattern.

PFA v6.
No other changes from v5 except for mandatory rebase of v5-0005 due to 18954ce.

=== CC changes
I changed Tomas's email to tomas@vondra.me, as the @enterprisedb one no
longer exists.

--
Best regards,
Kirill Reshke

Attachments:

v30-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v30-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchDownload
From e78bc24a3a355c5731cb677b89456ceb9fbd9b55 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v30 1/5] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..c76349bf436 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

v30-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v30-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From 5cb507ead70de64cd966821045a77ae25ef2433a Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v30 2/5] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other indexes must take the same safety measures:
 - making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This patch relocates the existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..acdcf5729f7
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+ * Amcheck main workhorse.
+ * Given an index relation OID, lock the relation and then take a number of
+ * standard actions:
+ * 1) make sure the index can be checked,
+ * 2) switch to the table owner's security context,
+ * 3) keep track of GUCs modified via index functions,
+ * 4) execute the callback function to verify integrity.
+ */
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as the target for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check() */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c76349bf436..1da4f0c3461 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+				 has_interval_ops
+				 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+				 : 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1
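
The refactoring above exposes amcheck_lock_relation_and_check() and the IndexDoCheckCallback type so that other access methods can reuse the same table-before-index locking, owner-context switch and GUC handling. Below is a minimal, hypothetical sketch of how a new verifier would plug into that contract; demo_index_check and demo_check_callback are illustrative names only (a real verifier also needs a CREATE FUNCTION declaration in the extension script), and GIN_AM_OID is used merely as an example AM OID.

/*
 * Hypothetical sketch of an AM verifier built on verify_common.h;
 * the demo_* names are not part of the patch.
 */
#include "postgres.h"

#include "catalog/pg_am.h"
#include "fmgr.h"
#include "utils/rel.h"
#include "verify_common.h"

PG_FUNCTION_INFO_V1(demo_index_check);

/* Called by amcheck_lock_relation_and_check() once heap and index are open */
static void
demo_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
{
	/* AM-specific invariant checks would go here */
	elog(DEBUG1, "checking index \"%s\", readonly=%d",
		 RelationGetRelationName(indrel), readonly);
}

Datum
demo_index_check(PG_FUNCTION_ARGS)
{
	Oid			indrelid = PG_GETARG_OID(0);

	/* Heap is locked before the index; the callback runs as the table owner */
	amcheck_lock_relation_and_check(indrelid, GIN_AM_OID,
									demo_check_callback,
									AccessShareLock, NULL);

	PG_RETURN_VOID();
}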

Attachment: v30-0004-Add-gin_index_check-to-verify-GIN-index.patch (application/octet-stream)
From 1cd7e7650d1723c11ffc001e8ee8376f61884c60 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v30 4/5] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 755 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 892 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..39baae40f0c
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,755 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so report it as corruption if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so we may have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of
+					 * a concurrent update of the parent page. So, lock the
+					 * parent and try to find the downlink for the current
+					 * page. It may be missing due to a concurrent page
+					 * split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Re-check against the re-found downlink, if any */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u in parent block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted.  If it is,
+						 * proceed to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1
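
Once the extension is at version 1.5 (per the upgrade script above; an existing installation would need ALTER EXTENSION amcheck UPDATE), the new checks are plain SQL calls. The statements below are a usage sketch; the index names are the ones created by the regression tests in these patches.

-- GIN verification takes only AccessShareLock on the index and its table.
SELECT gin_index_check('gin_check_idx');

-- GiST verification; the second argument requests the heapallindexed re-check.
SELECT gist_index_check('gist_check_idx1', true);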

Attachment: v30-0003-Add-gist_index_check-function-to-verify-GiST-ind.patch (application/octet-stream)
From 3934621f6aaf2659e38691f7519738ad53fd7e99 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v30 3/5] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are included in the parent tuple's
keyspace. The traversal holds a lock on only one page at a time until
a discrepancy is found. To re-check a suspicious pair of parent and
child tuples it acquires locks on both parent and child pages in the
same order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter of heap tuples verified as present in the index */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Initialize the GistCheckState fields needed for heapallindexed
+ * verification: the Bloom filter and the snapshot.
+ */
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index. This logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on the leaf pages.  To do this we check every page
+ * with gist_check_page().
+ *
+ * The check allocates a memory context and scans through the GiST graph in
+ * a depth-first search using a stack of GistScanItems. Initially this stack
+ * contains only the root block number. On each iteration the top block
+ * number is replaced by the referenced block numbers.
+ *
+ * gist_check_page(), in turn, takes every tuple on a page and tries to
+ * adjust the parent's downlink tuple by it.  A parent GiST tuple should
+ * never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan from before commit
+	 * fe280694d, which introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split after we looked at the
+		 * parent, in which case we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			/* the root has no parent tuple to copy */
+			ptr->parenttup = stack->parenttup ?
+				CopyIndexTuple(stack->parenttup) : NULL;
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such tuples.
+		 * We consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between the parent and child tuples.
+			 * We need to verify that it is not the result of a concurrent
+			 * call of gistplacetopage(). So, lock the parent and try to find
+			 * the downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now, we hold locks
+			 * on both the parent and child buffers. Thus the parent tuple
+			 * must include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* Recheck the tuple under lock before reporting corruption */
+			if (!stack->parenttup)
+				elog(NOTICE, "unable to find parent tuple for block %u in block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * manual intervention in the system catalog. Detoasting of key attributes
+ * is expected to be done by opclass decompression methods, if the indexed
+ * type can be toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with an excessive number of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, so they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %u became a leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple requires
+      adjustment) and that its page graph respects balanced-tree invariants
+      (internal pages reference either only leaf pages or only internal
+      pages).  When <parameter>heapallindexed</parameter> is
+      <literal>true</literal>, it also verifies that every heap tuple that
+      should be indexed has a matching index tuple.
+     </para>
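+     <para>
+      For example, to verify a GiST index and additionally check that every
+      heap tuple that should be indexed has a matching index tuple (the
+      index name here is only an illustration):
+<programlisting>
+SELECT gist_index_check('some_gist_index', true);
+</programlisting>
+     </para>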
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

v30-0005-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v30-0005-Add-GiST-support-to-pg_amcheck.patchDownload
From e695d157c919fa5dfc5ba34b43620d18a595b899 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v30 5/5] Add GiST support to pg_amcheck

Proof of concept patch for pg_amcheck binary support
for GIST and GIN index checks.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 290 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 220 insertions(+), 143 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 27a7d5e925e..8146ea1e604 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -40,8 +40,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -75,10 +74,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -99,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -135,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -152,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -179,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -196,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -288,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -323,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -382,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -531,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -599,36 +606,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -650,8 +653,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -784,13 +787,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -868,7 +887,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -914,6 +933,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to the btree equivalent: creates a SQL command for running amcheck
+ * verification on the given GiST index relation.
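+ *
+ * For illustration (the schema name and OID below are made up), with amcheck
+ * installed in schema "public", an index with OID 16429, and
+ * --heapallindexed given, the generated command looks roughly like:
+ *
+ *   SELECT public.gist_index_check(index := c.oid, heapallindexed := true)
+ *   FROM pg_catalog.pg_class c, pg_catalog.pg_index i
+ *   WHERE c.oid = 16429
+ *     AND c.oid = i.indexrelid
+ *     AND c.relpersistence != 't'
+ *     AND i.indisready AND i.indisvalid AND i.indislive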
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -953,7 +994,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1103,11 +1144,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1116,7 +1157,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1127,12 +1168,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1142,7 +1183,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1156,7 +1197,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1210,6 +1251,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1423,11 +1466,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1462,14 +1505,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1498,17 +1541,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1766,7 +1809,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1808,7 +1851,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1846,8 +1889,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1870,7 +1913,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1897,7 +1940,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1907,7 +1950,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1915,36 +1958,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1983,7 +2026,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ("
 						  CppAsString2(RELKIND_RELATION) ", "
 						  CppAsString2(RELKIND_SEQUENCE) ", "
@@ -1995,10 +2038,10 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  CppAsString2(RELKIND_SEQUENCE) ", "
 						  CppAsString2(RELKIND_MATVIEW) ", "
 						  CppAsString2(RELKIND_TOASTVALUE) ")) OR "
-						  "(c.relam = %u AND c.relkind = "
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = "
 						  CppAsString2(RELKIND_INDEX) "))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2027,7 +2070,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2035,9 +2078,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2050,15 +2093,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = " CppAsString2(RELKIND_INDEX),
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2066,7 +2109,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2087,7 +2130,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -2107,12 +2150,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2121,29 +2165,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2163,8 +2207,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2174,15 +2219,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2204,10 +2251,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2217,7 +2265,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for checkunique option and GiST indexes for not supported versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test --checkunique');
 done_testing();
-- 
2.34.1

#55Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#51)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Tue, 26 Nov 2024 at 11:50, Kirill Reshke <reshkekirill@gmail.com> wrote:

=== problems with gin_index_check

1)
```
reshke@ygp-jammy:~/postgres/contrib/amcheck$ ../../pgbin/bin/psql db1
psql (18devel)
Type "help" for help.

db1=# select gin_index_check('users_search_idx');
ERROR: index "users_search_idx" has wrong tuple order, block 35868, offset 33
```

For some reason gin_index_check fails on my index. I am 99% sure there
is no corruption in it. I will try to investigate.

2) This was already discovered by Tomas, but I'll add my input here:

psql session:
```
db1=# set log_min_messages to debug5;
SET
db1=# select gin_index_check('users_search_idx');

```

gdb session:
```
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140601454760896) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140601454760896) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140601454760896,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007fe055af0476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007fe055ad67f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000055ea82af4ef0 in ExceptionalCondition
(conditionName=conditionName@entry=0x7fe04a87aa35
"ItemPointerIsValid(pointer)",
fileName=fileName@entry=0x7fe04a87a928
"../../src/include/storage/itemptr.h",
lineNumber=lineNumber@entry=126) at assert.c:66
#6 0x00007fe04a871372 in ItemPointerGetOffsetNumber
(pointer=<optimized out>) at ../../src/include/storage/itemptr.h:126
#7 ItemPointerGetOffsetNumber (pointer=<optimized out>) at
../../src/include/storage/itemptr.h:124
#8 gin_check_posting_tree_parent_keys_consistency
(posting_tree_root=<optimized out>, rel=<optimized out>) at
verify_gin.c:296
#9 gin_check_parent_keys_consistency (rel=rel@entry=0x7fe04a8aa328,
heaprel=heaprel@entry=0x7fe04a8a9db8,
callback_state=callback_state@entry=0x0,
readonly=readonly@entry=false) at verify_gin.c:597
#10 0x00007fe04a87098d in amcheck_lock_relation_and_check
(indrelid=16488, am_id=am_id@entry=2742,
check=check@entry=0x7fe04a870a80 <gin_check_parent_keys_consistency>,
lockmode=lockmode@entry=1,
state=state@entry=0x0) at verify_common.c:132
#11 0x00007fe04a871e34 in gin_index_check (fcinfo=<optimized out>) at
verify_gin.c:81
#12 0x000055ea827cc275 in ExecInterpExpr (state=0x55ea84903390,
econtext=0x55ea84903138, isnull=<optimized out>) at
execExprInterp.c:770
#13 0x000055ea82804fdc in ExecEvalExprSwitchContext
(isNull=0x7ffeba7fdd37, econtext=0x55ea84903138, state=0x55ea84903390)
at ../../../src/include/executor/executor.h:367
#14 ExecProject (projInfo=0x55ea84903388) at
../../../src/include/executor/executor.h:401
#15 ExecResult (pstate=<optimized out>) at nodeResult.c:135
#16 0x000055ea827d007a in ExecProcNode (node=0x55ea84903028) at
../../../src/include/executor/executor.h:278
#17 ExecutePlan (execute_once=<optimized out>, dest=0x55ea84901940,
direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>,
operation=CMD_SELECT, use_parallel_mode=<optimized out>,
planstate=0x55ea84903028, estate=0x55ea84902e00) at execMain.c:1655
#18 standard_ExecutorRun (queryDesc=0x55ea8485c1a0,
direction=<optimized out>, count=0, execute_once=<optimized out>) at
execMain.c:362
#19 0x000055ea829ad6df in PortalRunSelect (portal=0x55ea848b1810,
forward=<optimized out>, count=0, dest=<optimized out>) at
pquery.c:924
#20 0x000055ea829aedc1 in PortalRun
(portal=portal@entry=0x55ea848b1810,
count=count@entry=9223372036854775807,
isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
dest=dest@entry=0x55ea84901940,
altdest=altdest@entry=0x55ea84901940, qc=0x7ffeba7fdfd0) at
pquery.c:768
#21 0x000055ea829aab47 in exec_simple_query
(query_string=0x55ea84831250 "select
gin_index_check('users_search_idx');") at postgres.c:1283
#22 0x000055ea829ac777 in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4798
#23 0x000055ea829a6a33 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:107
#24 0x000055ea8290122f in postmaster_child_launch
(child_type=<optimized out>, child_slot=1,
startup_data=startup_data@entry=0x7ffeba7fe48c "",
startup_data_len=startup_data_len@entry=4,
client_sock=client_sock@entry=0x7ffeba7fe490) at launch_backend.c:274
#25 0x000055ea82904c3f in BackendStartup (client_sock=0x7ffeba7fe490)
at postmaster.c:3377
#26 ServerLoop () at postmaster.c:1663
#27 0x000055ea8290656b in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x55ea8482ab10) at postmaster.c:1361
#28 0x000055ea825ecc0a in main (argc=3, argv=0x55ea8482ab10) at main.c:196
(gdb)
```

We also need to change the default version of the extension to 1.5.
I'm not sure which patch of this series should do that.
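
For reference, the bump itself is normally just two small pieces. This
is a rough sketch only; the signature below is a placeholder for
whatever the patch's SQL script actually declares:

```
-- 1) amcheck.control: default_version = '1.5'
-- 2) a new upgrade script, amcheck--1.4--1.5.sql, declaring the new
--    entry points, e.g.:

CREATE FUNCTION gin_index_check(index regclass)
RETURNS VOID
AS 'MODULE_PATHNAME', 'gin_index_check'
LANGUAGE C STRICT PARALLEL RESTRICTED;
```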

Both problems are resolved with a correct maxoff calculation:

```
reshke@ygp-jammy:~/postgres$ git diff contrib/amcheck/verify_gin.c
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 39baae40f0c..ddf072d468d 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -412,7 +412,7 @@ gin_check_parent_keys_consistency(Relation rel,
                LockBuffer(buffer, GIN_SHARE);
                page = (Page) BufferGetPage(buffer);
                lsn = BufferGetLSNAtomic(buffer);
-               maxoff = PageGetMaxOffsetNumber(page);
+               maxoff = GinPageGetOpaque(page)->maxoff;

/* Do basic sanity checks on the page headers */
check_index_page(rel, buffer, stack->blkno);
```
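
The assertion failure above fires in the posting-tree part of the
check, and GIN posting-tree pages do not keep a standard line-pointer
array: an internal data page stores raw PostingItems right after the
page header, with the item count kept in the GIN opaque area, so a
bound computed with PageGetMaxOffsetNumber() (which is derived from
pd_lower) presumably walks past the valid entries there. A minimal
sketch of how such a page is typically iterated (illustration only,
not taken from the patch):

```
#include "postgres.h"

#include "access/ginblock.h"
#include "storage/bufpage.h"

/*
 * Walk the downlinks of an internal posting-tree page.  The bound
 * comes from the GIN opaque area, not from the line-pointer math done
 * by PageGetMaxOffsetNumber().
 */
static void
walk_posting_tree_internal_page(Page page)
{
	OffsetNumber maxoff = GinPageGetOpaque(page)->maxoff;
	OffsetNumber i;

	Assert(GinPageIsData(page) && !GinPageIsLeaf(page));

	for (i = FirstOffsetNumber; i <= maxoff; i++)
	{
		PostingItem *pitem = GinDataPageGetPostingItem(page, i);

		/* pitem->key and the child block number are now safe to read */
		(void) PostingItemGetBlockNumber(pitem);
	}
}
```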

--
Best regards,
Kirill Reshke

Attachments:

v31-0005-Add-GiST-support-to-pg_amcheck.patch (application/octet-stream)
From ef5acedd5e02d35925985cf6b166877a54d4788c Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v31 5/5] Add GiST support to pg_amcheck

Proof of concept patch for pg_amcheck binary support
for GIST and GIN index checks.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 290 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 220 insertions(+), 143 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 27a7d5e925e..8146ea1e604 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -40,8 +40,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -75,10 +74,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -99,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -135,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -152,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -179,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -196,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -288,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -323,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -382,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -531,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -599,36 +606,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -650,8 +653,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -784,13 +787,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -868,7 +887,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -914,6 +933,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to btree equivalent prepares command to check GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -953,7 +994,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1103,11 +1144,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command is assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1116,7 +1157,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1127,12 +1168,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1142,7 +1183,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1156,7 +1197,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1210,6 +1251,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1423,11 +1466,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1462,14 +1505,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1498,17 +1541,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1766,7 +1809,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1808,7 +1851,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1846,8 +1889,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1870,7 +1913,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1897,7 +1940,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1907,7 +1950,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1915,36 +1958,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1983,7 +2026,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ("
 						  CppAsString2(RELKIND_RELATION) ", "
 						  CppAsString2(RELKIND_SEQUENCE) ", "
@@ -1995,10 +2038,10 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  CppAsString2(RELKIND_SEQUENCE) ", "
 						  CppAsString2(RELKIND_MATVIEW) ", "
 						  CppAsString2(RELKIND_TOASTVALUE) ")) OR "
-						  "(c.relam = %u AND c.relkind = "
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = "
 						  CppAsString2(RELKIND_INDEX) "))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2027,7 +2070,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2035,9 +2078,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2050,15 +2093,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = " CppAsString2(RELKIND_INDEX),
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2066,7 +2109,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2087,7 +2130,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -2107,12 +2150,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2121,29 +2165,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2163,8 +2207,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2174,15 +2219,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2204,10 +2251,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2217,7 +2265,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for checkunique option and GiST indexes for not supported versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test --checkunique');
 done_testing();
-- 
2.34.1

v31-0002-Refactor-amcheck-internals-to-isolate-common-loc.patch (application/octet-stream)
From 5cb507ead70de64cd966821045a77ae25ef2433a Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v31 2/5] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other indexes must take the same safety measures:
 - Making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This contribution relocates the existing functionality to amcheck.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..acdcf5729f7
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given index relation OID, lock relation.
+* Next, take a number of standard actions:
+* 1) Make sure the index can be checked
+* 2) change the context of the user,
+* 3) keep track of GUCs modified via index functions
+* 4) execute callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c76349bf436..1da4f0c3461 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1

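A note for readers skimming the patch: the refactoring above only moves the locking and ownership-switching boilerplate into amcheck_lock_relation_and_check(); the SQL-level entry points are unchanged. As a minimal sketch (assuming the bt_index_check and bt_index_parent_check signatures declared in the existing amcheck--1.x extension scripts, which are not part of this excerpt; "some_btree_index" is a placeholder name):

    -- lightweight check, runs under AccessShareLock; the optional arguments
    -- map to the heapallindexed and checkunique flags in BTCallbackState
    SELECT bt_index_check('some_btree_index'::regclass, true, false);

    -- stronger check, runs under ShareLock; also accepts rootdescend
    SELECT bt_index_parent_check('some_btree_index'::regclass, true, false, false);
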
Attachment: v31-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patch (application/octet-stream)
From e78bc24a3a355c5731cb677b89456ceb9fbd9b55 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v31 1/5] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..c76349bf436 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

Attachment: v31-0004-Add-gin_index_check-to-verify-GIN-index.patch (application/octet-stream)
From fab0d9c48801247cfc75a92f0a5cb2dd945c11cc Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v31 4/5] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 755 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 892 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..ddf072d468d
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,755 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph,
+ * checking parent-child key consistency.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+			{
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			}
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so complain if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				elog(DEBUG3, "key (%u, %u) -> %u",
+					 ItemPointerGetBlockNumber(&posting_item->key),
+					 ItemPointerGetOffsetNumber(&posting_item->key),
+					 BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = GinPageGetOpaque(page)->maxoff;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected");
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of a
+					 * concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* Re-check against the re-found downlink, if any */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted.  If so, proceed
+						 * to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of them.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require tuple adjustment) and that the page graph respects
+      balanced-tree invariants (internal pages reference either only leaf
+      pages or only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1

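As a quick way to exercise the new function beyond the regression test above (a sketch, not part of the patch; it assumes the extension has been updated to 1.5), one can mirror the per-index query pattern already used in the amcheck documentation and run gin_index_check() over every valid GIN index in the current database:

    SELECT gin_index_check(c.oid)
    FROM pg_index i
    JOIN pg_class c ON i.indexrelid = c.oid
    JOIN pg_am am ON c.relam = am.oid
    WHERE am.amname = 'gin'
      AND i.indisvalid                 -- index_checkable() errors out on invalid indexes
      AND c.relpersistence <> 't'      -- and on temporary indexes of other sessions
    ORDER BY c.relpages DESC;

Each call takes only AccessShareLock on the index and its table, per the comment on gin_index_check() above, so it can run alongside normal read/write traffic.
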
Attachment: v31-0003-Add-gist_index_check-function-to-verify-GiST-ind.patch (application/octet-stream)
From 3934621f6aaf2659e38691f7519738ad53fd7e99 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v31 3/5] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are contained in their parent tuple's keyspace.
The traversal locks only one page at a time until some discrepancy is
found. To re-check a suspicious pair of parent and child tuples it
acquires locks on both the parent and child pages in the same order as
a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle a concurrent scan of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter for reporting percentage of work already done */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+* Initialize the GiST check state fields needed for heapallindexed
+* verification: the bloom filter and the snapshot.
+*/
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size the Bloom filter based on the estimated number of tuples in the
+	 * index. This logic is similar to the B-tree case, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that tuples on internal pages cover all
+ * the key space of the tuples on leaf pages.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * The check allocates a memory context and scans through the
+ * GiST graph. The scan is a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block
+ * number. On each iteration the top block number is replaced by the
+ * referenced block numbers.
+ *
+ * gist_check_page() in its turn takes every tuple on a page and checks
+ * it against the downlink we followed to reach that page.  A parent GiST
+ * tuple should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan from before commit
+	 * fe280694d, which introduced physical-order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split after we looked at the
+		 * parent, in which case we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal page traversal encountered a leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such tuples.
+		 * We consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify that it is not the result of a concurrent call
+			 * of gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* If we re-found the downlink, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * The re-found parent tuple covers the child - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue of gistFormTuple, but performs detoasting
+ * of all INCLUDEd data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * manual intervention in the system catalogs. Detoasting of key attributes is
+ * expected to be done by the opclass decompression methods, if the indexed
+ * type can be toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verify basic invariants of the GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

#56Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#55)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Fri, 29 Nov 2024 at 22:24, Kirill Reshke <reshkekirill@gmail.com> wrote:
ic(buffer);

-               maxoff = PageGetMaxOffsetNumber(page);
+               maxoff = GinPageGetOpaque(page)->maxoff;

This change was not correct at all.
PFA v32.

A short summary of what's changed:
1) More debug1-debug3 output added.
2) The assertion failure under debug3 (pointed out by Tomas Vondra) is
resolved again. The fix is to not call ItemPointerGetOffsetNumber on the
rightmost tuple, because ItemPointerGetOffsetNumber expects a valid pointer
and the rightmost tuple's pointer is (0, 0); see the sketch after this list.
3) pgindent run.
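
Here is a minimal illustrative sketch of the guard I mean in item 2. It is
not the code from the attached patches, and the helper name is made up; the
only point is that the rightmost tuple's item pointer is invalid, so it has
to be tested with ItemPointerIsValid() before ItemPointerGetOffsetNumber()
is applied (whose Assert otherwise fires in assert-enabled builds):

```c
/*
 * Illustrative sketch only (not from the attached patches); the helper
 * name is hypothetical.  On a GIN entry-tree page the rightmost tuple
 * carries an invalid item pointer, (0, 0), so ItemPointerGetOffsetNumber()
 * must not be applied to it.  Test validity first and skip the
 * offset-based checks for that tuple.
 */
#include "postgres.h"

#include "access/itup.h"
#include "storage/itemptr.h"

static OffsetNumber
downlink_offset_if_valid(IndexTuple itup)
{
	ItemPointer tid = &itup->t_tid;

	if (!ItemPointerIsValid(tid))
		return InvalidOffsetNumber;	/* rightmost tuple: nothing to check */

	return ItemPointerGetOffsetNumber(tid);
}
```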

Only 0004 was changed since v31. To clarify, gin_index_check still
does not work under some conditions: it works on some types of
indexes and not on others.
I think the issue is around the full entry page.

Repro:
```
db1=# create table ttt(t text);
CREATE TABLE
db1=# create index on ttt using gin(t gin_trgm_ops);
CREATE INDEX
db1=# insert into ttt select md5(random()::text) from generate_series(1,50000);
INSERT 0 50000
db1=# set client_min_messages to debug5;
DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED;
state: INPROGRESS, xid/subid/cid: 0/1/0
SET
db1=# select gin_index_check('ttt_t_idx');
DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state:
INPROGRESS, xid/subid/cid: 0/1/0
DEBUG: processing entry tree page at blk 1, maxoff: 2
DEBUG: processing entry tree page at blk 941, maxoff: 229
ERROR: index "ttt_t_idx" has wrong tuple order on entry tree page,
block 941, offset 229
db1=#
```
I have only observed failures on the last tuple of an entry page. All
other known issues that were present in v31 are now fixed (I hope).

--
Best regards,
Kirill Reshke

Attachments:

v32-0005-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v32-0005-Add-GiST-support-to-pg_amcheck.patchDownload
From b6ae0cee36ba03d060502a89d11dbbed17b3b10f Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v32 5/5] Add GiST support to pg_amcheck

Proof of concept patch for pg_amcheck binary support
for GIST and GIN index checks.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 290 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 220 insertions(+), 143 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 27a7d5e925e..8146ea1e604 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -40,8 +40,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -75,10 +74,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -99,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -135,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -152,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -179,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -196,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -288,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -323,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -382,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -531,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -599,36 +606,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -650,8 +653,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -784,13 +787,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -868,7 +887,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -914,6 +933,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to its btree equivalent, prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -953,7 +994,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1103,11 +1144,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1116,7 +1157,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1127,12 +1168,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1142,7 +1183,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1156,7 +1197,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1210,6 +1251,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1423,11 +1466,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1462,14 +1505,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1498,17 +1541,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1766,7 +1809,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1808,7 +1851,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1846,8 +1889,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1870,7 +1913,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1897,7 +1940,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1907,7 +1950,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1915,36 +1958,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1983,7 +2026,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ("
 						  CppAsString2(RELKIND_RELATION) ", "
 						  CppAsString2(RELKIND_SEQUENCE) ", "
@@ -1995,10 +2038,10 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  CppAsString2(RELKIND_SEQUENCE) ", "
 						  CppAsString2(RELKIND_MATVIEW) ", "
 						  CppAsString2(RELKIND_TOASTVALUE) ")) OR "
-						  "(c.relam = %u AND c.relkind = "
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = "
 						  CppAsString2(RELKIND_INDEX) "))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2027,7 +2070,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2035,9 +2078,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2050,15 +2093,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = " CppAsString2(RELKIND_INDEX),
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2066,7 +2109,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2087,7 +2130,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -2107,12 +2150,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2121,29 +2165,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2163,8 +2207,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2174,15 +2219,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2204,10 +2251,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2217,7 +2265,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for checkunique option and GiST indexes for not supported versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test for GiST verification with amcheck 1.3');
 done_testing();
-- 
2.34.1
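
For illustration only, a minimal sketch (not part of the attached patches) of how
the amoid column collected by the query above could drive per-AM dispatch in
pg_amcheck; the helper name is made up, while bt_index_check() and
gist_index_check() are the server-side functions amcheck provides:

#include "postgres_fe.h"

#include "catalog/pg_am_d.h"

/*
 * Illustration only: map the access method OID stored in RelationInfo->amoid
 * to the name of the server-side verification function to call for that
 * index.  The helper name is hypothetical.
 */
static const char *
index_check_function_for_am(Oid amoid)
{
	switch (amoid)
	{
		case BTREE_AM_OID:
			return "bt_index_check";
		case GIST_AM_OID:
			return "gist_index_check";
		default:
			return NULL;		/* access method not supported yet */
	}
}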

v32-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v32-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From 5cb507ead70de64cd966821045a77ae25ef2433a Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v32 2/5] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Verification of any other index AM must take the same safety measures as the
existing B-tree code before doing its checks:
 - making sure the index can be checked
 - switching to the table owner's security context
 - keeping track of GUCs modified via index functions
This patch relocates that existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..acdcf5729f7
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+ * Amcheck main workhorse.
+ *
+ * Given an index relation OID, lock the relation and take a number of
+ * standard precautions before running the verification callback:
+ * 1) make sure the index can be checked,
+ * 2) switch to the table owner's security context,
+ * 3) keep track of GUCs modified via index functions,
+ * 4) execute the callback function to verify integrity.
+ */
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+		RestrictSearchPath();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that the relation is suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check() */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c76349bf436..1da4f0c3461 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+				 has_interval_ops
+				 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+				 : 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1
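
To make the intended reuse concrete, here is a minimal sketch (not part of the
patch) of how an AM-specific entry point sits on top of
amcheck_lock_relation_and_check(); demo_index_check and demo_check_callback are
placeholder names, and the GIN entry point added later in the series follows
this same shape:

#include "postgres.h"

#include "catalog/pg_am.h"
#include "fmgr.h"
#include "verify_common.h"

PG_FUNCTION_INFO_V1(demo_index_check);

/*
 * Called only after the common code has locked both relations and verified
 * that the index is checkable.
 */
static void
demo_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
{
	/* AM-specific page-level verification would go here */
}

/*
 * demo_index_check(index regclass)
 *
 * All locking, owner security context switching and GUC tracking is delegated
 * to amcheck_lock_relation_and_check().
 */
Datum
demo_index_check(PG_FUNCTION_ARGS)
{
	Oid			indrelid = PG_GETARG_OID(0);

	amcheck_lock_relation_and_check(indrelid,
									GIN_AM_OID, /* expected access method */
									demo_check_callback,
									AccessShareLock,
									NULL);

	PG_RETURN_VOID();
}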

v32-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v32-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchDownload
From e78bc24a3a355c5731cb677b89456ceb9fbd9b55 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v32 1/5] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..c76349bf436 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

v32-0004-Add-gin_index_check-to-verify-GIN-index.patchapplication/octet-stream; name=v32-0004-Add-gin_index_check-to-verify-GIN-index.patchDownload
From a73e5c5fabd5401f6b4ddffadb5b7fb5a4a752a0 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v32 4/5] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 +++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 767 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 904 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..e18adf3d1c7
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,767 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include <string.h>
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[100];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			Assert(GinPageIsData(page));
+			maxoff = GinPageGetOpaque(page)->maxoff;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so report an error if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (GinPageGetOpaque(page)->rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = GinPageGetOpaque(page)->rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/* (apparently) first block is metadata, skip order check */
+			if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify it is not a result of
+					 * concurrent call of gistplacetopage(). So, lock parent
+					 * and try to find downlink for current page. It may be
+					 * missing due to concurrent page split, this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink was re-found, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted.  If so, proceed
+						 * to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of them.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1

v32-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v32-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 3934621f6aaf2659e38691f7519738ad53fd7e99 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v32 3/5] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are included in the parent tuple's keyspace.
The traversal holds a lock on only one page at a time until some
discrepancy is found. To re-check a suspicious pair of parent and child
tuples it acquires locks on both parent and child pages in the same
order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents of
+	 * the referenced page. This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter for reporting percentage of work already done */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass, heapallindexed boolean)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+* Initialize the GiST check state fields needed for the heapallindexed check.
+* This initializes the bloom filter and the snapshot.
+*/
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to the B-tree case; see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on leaf pages.  To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * This check allocates a memory context and scans through the
+ * GiST graph. The scan is performed as a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number. On
+ * each iteration the top block number is replaced by the referenced block numbers.
+ *
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries to
+ * adjust it by the tuples on the referenced child page.  A parent GiST tuple
+ * should never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively the "old" VACUUM scan used before commit
+	 * fe280694d, which introduced physical order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, in which case we missed the downlink of the right
+		 * sibling when we scanned the parent.  If so, add the right
+		 * sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See also
+		 * gistdoinsert() and gistbulkdelete() handling of such tuples. We do
+		 * consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify it is not a result of concurrent call of
+			 * gistplacetopage(). So, lock parent and try to find downlink for
+			 * current page. It may be missing due to concurrent page split,
+			 * this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* If the downlink was re-found, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has item id on page %u offset %u not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention into the system catalog. Detoasting of key attributes is expected
+ * to be done by opclass decompression methods, if the indexed type might be
+ * toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples must be coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

#57Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#56)
6 attachment(s)
Re: Amcheck verification of GiST and GIN

On Mon, 2 Dec 2024 at 12:57, Kirill Reshke <reshkekirill@gmail.com> wrote:

This change was not correct at all.
PFA v32.

Repro:
```
db1=# create table ttt(t text);
CREATE TABLE
db1=# create index on ttt using gin(t gin_trgm_ops);
CREATE INDEX
db1=# insert into ttt select md5(random()::text) from generate_series(1,50000);
INSERT 0 50000
db1=# set client_min_messages to debug5;
DEBUG: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
SET
db1=# select gin_index_check('ttt_t_idx');
DEBUG: StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
DEBUG: processing entry tree page at blk 1, maxoff: 2
DEBUG: processing entry tree page at blk 941, maxoff: 229
ERROR: index "ttt_t_idx" has wrong tuple order on entry tree page, block 941, offset 229
db1=#
```
I have only observed failures on the last tuple of the entry page. All
other known issues that were on v31 are now fixed (I hope).

PFA v33.

The main change from v32 is a meson-related fix, plus a fix for the bug
that I reported in the previous message.

I did find this in README.
```
GIN packs keys and downlinks into tuples in a different way.

(P_0, K_1), (P_1, K_2), ... , (P_n, K_{n+1})

P_i is grouped with K_{i+1}. -Inf key is not needed.

There are couple of additional notes regarding K_{n+1} key.
1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
Highkey is assumed to be infinity.
2) In posting tree, a key coupled with P_n always doesn't matter. Highkey for
non-rightmost pages is stored separately and accessed via
GinDataPageGetRightBound().
```
I indeed observe gin_index_check failures only on the entry tree's
rightmost (non-leaf & non-root) page, on the last tuple (the high
key). So the fix is not to check it:

```
-                       /* (apparently) first block is metadata, skip order check */
-                       if (i != FirstOffsetNumber && stack->blkno != (BlockNumber) 1)
+                       /*
+                        * First block is metadata, skip order check. Also, never check
+                        * for high key on rightmost page, as this key is not really
+                        * stored explicitly.
+                        */
+                       if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+                               !(i == maxoff && rightlink == InvalidBlockNumber))
                        {
```
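
To spell out the rule that condition implements, here is a minimal standalone sketch
(illustration only, not taken from any of the patches; the function and parameter
names are made up, with is_root and is_rightmost standing in for
stack->blkno == GIN_ROOT_BLKNO and rightlink == InvalidBlockNumber in the real loop):

```
#include <stdbool.h>

/*
 * Illustration only: should the entry-tree key ordering check run for the
 * tuple at offset 'off'?  Mirrors the patched condition above.
 */
static bool
entry_order_check_applies(unsigned off, unsigned first_off, unsigned maxoff,
						  bool is_root, bool is_rightmost)
{
	if (off == first_off)
		return false;			/* no previous key to compare against */
	if (is_root)
		return false;			/* entry tree root page is skipped, as before */
	if (off == maxoff && is_rightmost)
		return false;			/* high key is assumed to be +infinity */
	return true;
}
```

With that, ginCompareEntries() is never applied to the implicit high key on the
rightmost entry page, which is exactly the tuple the repro above was tripping over.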

I also did a small wording correction for this part of README (this is v33-0006)

--
Best regards,
Kirill Reshke

Attachments:

v33-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v33-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 3380a7148a5c20f2da6d6d8c1d0f8693e518ccec Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v33 3/5] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST with a depth-first search and checks
that all downlink tuples are included in the parent tuple's keyspace.
The traversal holds a lock on only one page at a time until some
discrepancy is found. To re-check a suspicious pair of parent and child
tuples it acquires locks on both parent and child pages in the same
order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 1b38e0aba77..15ae94cc90f 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple will be NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle a concurrent scan of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to the next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter for reporting percentage of work already done */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+* Initialize the GistCheckState fields needed for heapallindexed verification.
+* This initializes the bloom filter and the snapshot.
+*/
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-tree, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that the tuples of internal pages cover all
+ * the key space of each tuple on a leaf page.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * This check allocates a memory context and scans through the GiST
+ * graph.  The scan is performed as a depth-first search using a stack
+ * of GistScanItems.  Initially the stack contains only the root block
+ * number.  On each iteration the top block number is replaced by the
+ * referenced block numbers.
+ *
+ * gist_check_page() in its turn takes every tuple and tries to adjust
+ * it by the tuples on the referenced child page.  A parent GiST tuple
+ * should never require any adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively "old" VACUUM version before commit
+	 * fe280694d which introduced physical order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such tuples.
+		 * We consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between the parent and child tuples.
+			 * We need to verify that it is not a result of a concurrent call
+			 * of gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks
+			 * on both the parent and child buffers. Thus the parent tuple
+			 * must include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* Re-check the tuple against the re-found parent, if any */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue of gistFormTuple, but performs detoasting
+ * of all included data (for covering indexes). While we do not expect
+ * toasted attributes in a normal index, this can happen as a result of
+ * intervention into the system catalog. Detoasting of key attributes is
+ * expected to be done by opclass decompression methods, if the indexed type
+ * might be toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants about GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST index
+      has consistent parent-child tuple relations (no parent tuple requires
+      adjustment) and that the page graph respects balanced-tree invariants
+      (internal pages reference either only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

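For readers who want to try the new check, here is a minimal usage sketch of the SQL-callable function added by the patch above; the table and index names are hypothetical and mirror the regression tests in check_gist.sql:

    CREATE EXTENSION IF NOT EXISTS amcheck;
    -- hypothetical demo relation, analogous to the gist_check test table
    CREATE TABLE gist_demo AS
        SELECT point(random(), s) AS c FROM generate_series(1, 1000) s;
    CREATE INDEX gist_demo_idx ON gist_demo USING gist (c);
    -- structural checks only (parent-child keys and graph invariants)
    SELECT gist_index_check('gist_demo_idx', false);
    -- additionally verify that every heap tuple has a matching index tuple
    SELECT gist_index_check('gist_demo_idx', true);
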
v33-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v33-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchDownload
From a73f5e5f73d9a0c12cc29c4b349950a6fe3ead36 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v33 1/5] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..c76349bf436 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

v33-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v33-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From 8870ff60fd743e090ebf86426a5b2bd76798af29 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v33 2/5] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other indexes must take the same safety measures:
 - Making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This contribution relocates the existing functionality to amcheck.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  only "btree" indexes are supported as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..1b38e0aba77 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'amcheck.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..acdcf5729f7
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given an index relation OID, lock the relation.
+* Next, take a number of standard actions:
+* 1) make sure the index can be checked,
+* 2) switch to the table owner's user context,
+* 3) keep track of GUCs modified via index functions,
+* 4) execute the callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when the requested lockmode is stronger than RowExclusiveLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when the requested lockmode is stronger than RowExclusiveLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that the relation is suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only \"%s\" indexes are supported as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c76349bf436..1da4f0c3461 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+		{
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+				has_interval_ops = true;
+		}
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+						RelationGetRelationName(indrel)),
+				 has_interval_ops
+				 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+				 : 0));
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1

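A quick SQL sketch of the user-visible effect of this refactoring, mirroring the updated check_btree.out above; the relation names are hypothetical and amcheck is assumed to be installed. Pointing a B-tree checker at an index of another access method now reports both the expected and the actual access method via the shared index_checkable() routine:

    -- hypothetical relations, analogous to bttest_a and bttest_a_brin_idx
    CREATE TABLE amcheck_demo (id int);
    CREATE INDEX amcheck_demo_brin ON amcheck_demo USING brin (id);
    -- fails: the relation is a brin index, not a btree index
    SELECT bt_index_parent_check('amcheck_demo_brin');
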
v33-0004-Add-gin_index_check-to-verify-GIN-index.patchapplication/octet-stream; name=v33-0004-Add-gin_index_check-to-verify-GIN-index.patchDownload
From 2970cfbdee597aee3d5122c4f8447a7f18dcbff2 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v33 4/5] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 774 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 911 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 15ae94cc90f..5c9ddfe0758 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'amcheck.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..2dc5fbba619
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,774 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages.  Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[MAXPGPATH];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in posting tree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			Assert(GinPageIsData(page));
+			maxoff = GinPageGetOpaque(page)->maxoff;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version.  That was a long time
+			 * ago, though, so treat a mismatch as corruption.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in posting tree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check.  Allocates a memory context and scans
+ * through the GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+		rightlink = GinPageGetOpaque(page)->rightlink;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* page split detected, add the right sibling to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 */
+			if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+				!(i == maxoff && rightlink == InvalidBlockNumber))
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples.  We need to verify that it is not the result of
+					 * a concurrent page split.  So, lock the parent and try
+					 * to find the downlink for the current page.  It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink is gone, assume a concurrent split */
+					if (!stack->parenttup)
+						elog(NOTICE, "unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check whether it is properly adjusted.  If so,
+						 * proceed to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with too many tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that the line pointer isn't LP_REDIRECT, LP_UNUSED, or LP_DEAD,
+	 * since GIN never uses any of them.  Verify that the line pointer has
+	 * storage, too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
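Note that verify_gin.c reports corruption by raising errors with ERRCODE_INDEX_CORRUPTED rather than returning rows, so a sweep over many indexes has to trap errors if it is meant to keep going. A rough PL/pgSQL sketch, not part of the patch and purely illustrative:

DO $$
DECLARE
    idx regclass;
BEGIN
    FOR idx IN
        SELECT c.oid::regclass
        FROM pg_class c
        JOIN pg_am am ON am.oid = c.relam
        WHERE c.relkind = 'i' AND am.amname = 'gin'
    LOOP
        BEGIN
            PERFORM gin_index_check(idx);   -- new function from this patch
        EXCEPTION WHEN OTHERS THEN
            RAISE WARNING 'gin_index_check(%) failed: %', idx, SQLERRM;
        END;
    END LOOP;
END
$$;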
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples require
+      adjustment) and that the page graph respects balanced-tree invariants
+      (internal pages reference either only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
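The documented signature above matches what prepare_gist_command in the pg_amcheck patch below passes (the index plus the heapallindexed flag); a hedged example call, with a placeholder index name:

SELECT gist_index_check('my_gist_index', heapallindexed => true);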
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1

v33-0005-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v33-0005-Add-GiST-support-to-pg_amcheck.patchDownload
From 544e53f83acc78c07488cb315359b1a5bef4d72d Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v33 5/5] Add GiST support to pg_amcheck

Proof of concept patch for pg_amcheck binary support
for GIST and GIN index checks.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 290 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 220 insertions(+), 143 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 27a7d5e925e..8146ea1e604 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -40,8 +40,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -75,10 +74,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -99,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -135,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -152,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -179,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -196,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -288,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -323,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -382,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -531,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -599,36 +606,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Now amcheck has only major and minor versions in the string but we
+		 * also support revision just in case. Now it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -650,8 +653,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -784,13 +787,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("Verification of index type %u not supported",
+				pg_log_info("verification of index type %u is not supported",
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -868,7 +887,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -914,6 +933,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to the btree equivalent; prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -953,7 +994,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1103,11 +1144,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1116,7 +1157,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1127,12 +1168,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1142,7 +1183,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1156,7 +1197,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1210,6 +1251,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1423,11 +1466,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1462,14 +1505,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1498,17 +1541,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1766,7 +1809,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1808,7 +1851,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1846,8 +1889,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1870,7 +1913,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1897,7 +1940,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1907,7 +1950,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1915,36 +1958,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1983,7 +2026,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ("
 						  CppAsString2(RELKIND_RELATION) ", "
 						  CppAsString2(RELKIND_SEQUENCE) ", "
@@ -1995,10 +2038,10 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  CppAsString2(RELKIND_SEQUENCE) ", "
 						  CppAsString2(RELKIND_MATVIEW) ", "
 						  CppAsString2(RELKIND_TOASTVALUE) ")) OR "
-						  "(c.relam = %u AND c.relkind = "
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = "
 						  CppAsString2(RELKIND_INDEX) "))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2027,7 +2070,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2035,9 +2078,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2050,15 +2093,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = " CppAsString2(RELKIND_INDEX),
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2066,7 +2109,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2087,7 +2130,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -2107,12 +2150,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2121,29 +2165,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2163,8 +2207,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2174,15 +2219,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2204,10 +2251,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2217,7 +2265,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to supported indexes, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for the checkunique option and GiST index checks on amcheck versions that do not support them.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test for GiST verification with amcheck 1.3');
 done_testing();
-- 
2.34.1
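
For context, a minimal SQL sketch of the index-level check that the pg_amcheck support above delegates to; the table and index names are invented for illustration, and the call follows the gist_index_check(index regclass, heapallindexed boolean) signature used elsewhere in this thread:

-- Hypothetical example: build and verify a small GiST index from SQL.
CREATE EXTENSION IF NOT EXISTS amcheck;
CREATE TABLE gist_demo(p point);
INSERT INTO gist_demo SELECT point(i, i % 100) FROM generate_series(1, 1000) i;
CREATE INDEX gist_demo_idx ON gist_demo USING gist(p);
-- The second argument requests the additional heapallindexed check.
SELECT gist_index_check('gist_demo_idx', true);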

v33-0006-Fix-wording-in-GIN-README.patchapplication/octet-stream; name=v33-0006-Fix-wording-in-GIN-README.patchDownload
From 5c9ffe515d681ece28741a4d469a3f9120dcc255 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 3 Dec 2024 15:02:47 +0000
Subject: [PATCH v33 6/6] Fix wording in GIN README.

---
 src/backend/access/gin/README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index b0807316212..742bcbad499 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -237,10 +237,10 @@ GIN packs keys and downlinks into tuples in a different way.
 
 P_i is grouped with K_{i+1}.  -Inf key is not needed.
 
-There are couple of additional notes regarding K_{n+1} key.
-1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
+There are a couple of additional notes regarding the K_{n+1} key.
+1) In the entry tree on the rightmost page, a key coupled with P_n doesn't really matter.
 Highkey is assumed to be infinity.
-2) In posting tree, a key coupled with P_n always doesn't matter.  Highkey for
+2) In the posting tree, a key coupled with P_n always doesn't matter.  Highkey for
 non-rightmost pages is stored separately and accessed via
 GinDataPageGetRightBound().
 
-- 
2.34.1

#58Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#57)
6 attachment(s)
Re: Amcheck verification of GiST and GIN

On Tue, 3 Dec 2024 at 20:04, Kirill Reshke <reshkekirill@gmail.com> wrote:

PFA v33.

CF bot didn't like this version[0]https://cirrus-ci.com/task/4770943156879360; for some reason I attached
v32-0002 as v33-0004.

Hope this version builds OK.

[0]: https://cirrus-ci.com/task/4770943156879360

--
Best regards,
Kirill Reshke

Attachments:

v34-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v34-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchDownload
From a73f5e5f73d9a0c12cc29c4b349950a6fe3ead36 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v34 1/6] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ffe4f721672..c76349bf436 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.34.1

v34-0004-Add-gin_index_check-to-verify-GIN-index.patchapplication/octet-stream; name=v34-0004-Add-gin_index_check-to-verify-GIN-index.patchDownload
From 241fddea9e97f92aec76bfef7d7338f15e4f08a5 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:22:26 +0500
Subject: [PATCH v34 4/6] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   3 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |   9 +
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   2 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 774 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  19 +
 src/tools/pgindent/pgindent            |   2 +-
 8 files changed, 911 insertions(+), 2 deletions(-)
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 952e458c53b..c01f8e618f3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,6 +4,7 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
@@ -13,7 +14,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gist check_heap
+REGRESS = check check_btree check_gin check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
index 3fc72364180..c013abc4f55 100644
--- a/contrib/amcheck/amcheck--1.4--1.5.sql
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -12,3 +12,12 @@ AS 'MODULE_PATHNAME', 'gist_index_check'
 LANGUAGE C STRICT;
 
 REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5ab83ca7778..cbe15f4244c 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gin.c',
   'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
@@ -38,6 +39,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_gist',
       'check_heap',
     ],
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..2dc5fbba619
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,774 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlinks, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of a depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[MAXPGPATH];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			Assert(GinPageIsData(page));
+			maxoff = GinPageGetOpaque(page)->maxoff;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's complain if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+		rightlink = GinPageGetOpaque(page)->rightlink;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 */
+			if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+				!(i == maxoff && rightlink == InvalidBlockNumber))
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent insertion adjusting the parent. So, lock
+					 * the parent and try to re-find the downlink for the
+					 * current page. It may be missing due to a concurrent
+					 * page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/*
+					 * If the downlink is gone, assume a concurrent split;
+					 * otherwise re-check against the refreshed parent key.
+					 */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted. If so, proceed
+						 * to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 6eb526c6bb7..55f2b587e57 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -189,6 +189,25 @@ ORDER BY c.relpages DESC LIMIT 10;
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term>
      <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index e889af6b1e4..e5ac0410665 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,7 +13,7 @@ use IO::Handle;
 use Getopt::Long;
 
 # Update for pg_bsd_indent version
-my $INDENT_VERSION = "2.1.2";
+my $INDENT_VERSION = "2.1.1";
 
 # Our standard indent settings
 my $indent_opts =
-- 
2.34.1
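
To make the new interface concrete, here is a minimal usage sketch distilled from the check_gin.sql regression test in the patch above (the table name is invented); gin_index_check() takes only the index, runs under AccessShareLock, and raises an ERRCODE_INDEX_CORRUPTED error if an invariant is violated:

-- Hypothetical example: build and verify a small GIN index from SQL.
CREATE EXTENSION IF NOT EXISTS amcheck;
CREATE TABLE gin_demo(tags int[]);
INSERT INTO gin_demo SELECT ARRAY[i % 10, i % 100, i % 1000] FROM generate_series(1, 10000) i;
CREATE INDEX gin_demo_idx ON gin_demo USING gin(tags);
-- Returns void on success, errors out on the first detected inconsistency.
SELECT gin_index_check('gin_demo_idx');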

v34-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v34-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From a048ec57335f109d18ed947e2aa3458ec40ec9af Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:08:10 +0500
Subject: [PATCH v34 2/6] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other index types must take the same safety measures:
 - making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This patch relocates the existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru

fix
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index fc08e32539a..6a1ba5d7619 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2024, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..c8ed685ba42
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given an index relation OID, lock the relation, then take a number of
+* standard actions:
+* 1) make sure the index can be checked,
+* 2) change the context of the user,
+* 3) keep track of GUCs modified via index functions,
+* 4) execute the callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation_and_check */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index c76349bf436..1da4f0c3461 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.34.1
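
For reference, a minimal SQL sketch of how the refactored entry points above are still invoked; SQL-level behavior is unchanged. The index name is just a placeholder, and the trailing booleans are the optional heapallindexed/rootdescend/checkunique arguments read positionally in the code above (the four-argument form needs amcheck 1.4 or later):

    -- placeholder index name; optional boolean arguments passed positionally
    SELECT bt_index_check('some_btree_idx'::regclass, true);
    SELECT bt_index_parent_check('some_btree_idx'::regclass, true, true, true);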

v34-0005-Add-GiST-support-to-pg_amcheck.patchapplication/octet-stream; name=v34-0005-Add-GiST-support-to-pg_amcheck.patchDownload
From 32bf8f47535cef157572e42307f69fcac33a0615 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sun, 5 Feb 2023 15:52:14 -0800
Subject: [PATCH v34 5/6] Add GiST support to pg_amcheck

Proof-of-concept patch adding pg_amcheck binary support
for GiST and GIN index checks.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
---
 src/bin/pg_amcheck/pg_amcheck.c      | 290 ++++++++++++++++-----------
 src/bin/pg_amcheck/t/002_nonesuch.pl |   8 +-
 src/bin/pg_amcheck/t/003_check.pl    |  65 ++++--
 3 files changed, 220 insertions(+), 143 deletions(-)

diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 27a7d5e925e..8146ea1e604 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -40,8 +40,7 @@ typedef struct PatternInfo
 								 * NULL */
 	bool		heap_only;		/* true if rel_regex should only match heap
 								 * tables */
-	bool		btree_only;		/* true if rel_regex should only match btree
-								 * indexes */
+	bool		index_only;		/* true if rel_regex should only match indexes */
 	bool		matched;		/* true if the pattern matched in any database */
 } PatternInfo;
 
@@ -75,10 +74,9 @@ typedef struct AmcheckOptions
 
 	/*
 	 * As an optimization, if any pattern in the exclude list applies to heap
-	 * tables, or similarly if any such pattern applies to btree indexes, or
-	 * to schemas, then these will be true, otherwise false.  These should
-	 * always agree with what you'd conclude by grep'ing through the exclude
-	 * list.
+	 * tables, or similarly if any such pattern applies to indexes, or to
+	 * schemas, then these will be true, otherwise false.  These should always
+	 * agree with what you'd conclude by grep'ing through the exclude list.
 	 */
 	bool		excludetbl;
 	bool		excludeidx;
@@ -99,14 +97,14 @@ typedef struct AmcheckOptions
 	int64		endblock;
 	const char *skip;
 
-	/* btree index checking options */
+	/* index checking options */
 	bool		parent_check;
 	bool		rootdescend;
 	bool		heapallindexed;
 	bool		checkunique;
 
-	/* heap and btree hybrid option */
-	bool		no_btree_expansion;
+	/* heap and indexes hybrid option */
+	bool		no_index_expansion;
 } AmcheckOptions;
 
 static AmcheckOptions opts = {
@@ -135,7 +133,7 @@ static AmcheckOptions opts = {
 	.rootdescend = false,
 	.heapallindexed = false,
 	.checkunique = false,
-	.no_btree_expansion = false
+	.no_index_expansion = false
 };
 
 static const char *progname = NULL;
@@ -152,13 +150,15 @@ typedef struct DatabaseInfo
 	char	   *datname;
 	char	   *amcheck_schema; /* escaped, quoted literal */
 	bool		is_checkunique;
+	bool		gist_supported;
 } DatabaseInfo;
 
 typedef struct RelationInfo
 {
 	const DatabaseInfo *datinfo;	/* shared by other relinfos */
 	Oid			reloid;
-	bool		is_heap;		/* true if heap, false if btree */
+	Oid			amoid;
+	bool		is_heap;		/* true if heap, false if index */
 	char	   *nspname;
 	char	   *relname;
 	int			relpages;
@@ -179,10 +179,12 @@ static void prepare_heap_command(PQExpBuffer sql, RelationInfo *rel,
 								 PGconn *conn);
 static void prepare_btree_command(PQExpBuffer sql, RelationInfo *rel,
 								  PGconn *conn);
+static void prepare_gist_command(PQExpBuffer sql, RelationInfo *rel,
+								 PGconn *conn);
 static void run_command(ParallelSlot *slot, const char *sql);
 static bool verify_heap_slot_handler(PGresult *res, PGconn *conn,
 									 void *context);
-static bool verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context);
+static bool verify_index_slot_handler(PGresult *res, PGconn *conn, void *context);
 static void help(const char *progname);
 static void progress_report(uint64 relations_total, uint64 relations_checked,
 							uint64 relpages_total, uint64 relpages_checked,
@@ -196,7 +198,7 @@ static void append_relation_pattern(PatternInfoArray *pia, const char *pattern,
 									int encoding);
 static void append_heap_pattern(PatternInfoArray *pia, const char *pattern,
 								int encoding);
-static void append_btree_pattern(PatternInfoArray *pia, const char *pattern,
+static void append_index_pattern(PatternInfoArray *pia, const char *pattern,
 								 int encoding);
 static void compile_database_list(PGconn *conn, SimplePtrList *databases,
 								  const char *initial_dbname);
@@ -288,6 +290,7 @@ main(int argc, char *argv[])
 	enum trivalue prompt_password = TRI_DEFAULT;
 	int			encoding = pg_get_encoding_from_locale(NULL, false);
 	ConnParams	cparams;
+	bool		gist_warn_printed = false;
 
 	pg_logging_init(argv[0]);
 	progname = get_progname(argv[0]);
@@ -323,11 +326,11 @@ main(int argc, char *argv[])
 				break;
 			case 'i':
 				opts.allrel = false;
-				append_btree_pattern(&opts.include, optarg, encoding);
+				append_index_pattern(&opts.include, optarg, encoding);
 				break;
 			case 'I':
 				opts.excludeidx = true;
-				append_btree_pattern(&opts.exclude, optarg, encoding);
+				append_index_pattern(&opts.exclude, optarg, encoding);
 				break;
 			case 'j':
 				if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX,
@@ -382,7 +385,7 @@ main(int argc, char *argv[])
 				maintenance_db = pg_strdup(optarg);
 				break;
 			case 2:
-				opts.no_btree_expansion = true;
+				opts.no_index_expansion = true;
 				break;
 			case 3:
 				opts.no_toast_expansion = true;
@@ -531,6 +534,10 @@ main(int argc, char *argv[])
 		int			ntups;
 		const char *amcheck_schema = NULL;
 		DatabaseInfo *dat = (DatabaseInfo *) cell->ptr;
+		int			vmaj = 0,
+					vmin = 0,
+					vrev = 0;
+		const char *amcheck_version;
 
 		cparams.override_dbname = dat->datname;
 		if (conn == NULL || strcmp(PQdb(conn), dat->datname) != 0)
@@ -599,36 +606,32 @@ main(int argc, char *argv[])
 												 strlen(amcheck_schema));
 
 		/*
-		 * Check the version of amcheck extension. Skip requested unique
-		 * constraint check with warning if it is not yet supported by
-		 * amcheck.
+		 * Check the version of amcheck extension.
 		 */
-		if (opts.checkunique == true)
-		{
-			/*
-			 * Now amcheck has only major and minor versions in the string but
-			 * we also support revision just in case. Now it is expected to be
-			 * zero.
-			 */
-			int			vmaj = 0,
-						vmin = 0,
-						vrev = 0;
-			const char *amcheck_version = PQgetvalue(result, 0, 1);
+		amcheck_version = PQgetvalue(result, 0, 1);
 
-			sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
+		/*
+		 * Currently amcheck has only major and minor versions in the string but
+		 * we also support a revision just in case; it is expected to be zero.
+		 */
+		sscanf(amcheck_version, "%d.%d.%d", &vmaj, &vmin, &vrev);
 
-			/*
-			 * checkunique option is supported in amcheck since version 1.4
-			 */
-			if ((vmaj == 1 && vmin < 4) || vmaj == 0)
-			{
-				pg_log_warning("option %s is not supported by amcheck version %s",
-							   "--checkunique", amcheck_version);
-				dat->is_checkunique = false;
-			}
-			else
-				dat->is_checkunique = true;
+		/*
+		 * checkunique option is supported in amcheck since version 1.4. Skip
+		 * requested unique constraint check with warning if it is not yet
+		 * supported by amcheck.
+		 */
+		if (opts.checkunique && ((vmaj == 1 && vmin < 4) || vmaj == 0))
+		{
+			pg_log_warning("option %s is not supported by amcheck version %s",
+						   "--checkunique", amcheck_version);
+			dat->is_checkunique = false;
 		}
+		else
+			dat->is_checkunique = opts.checkunique;
+
+		/* GiST indexes are supported in 1.5+ */
+		dat->gist_supported = ((vmaj == 1 && vmin >= 5) || vmaj > 1);
 
 		PQclear(result);
 
@@ -650,8 +653,8 @@ main(int argc, char *argv[])
 			if (pat->heap_only)
 				log_no_match("no heap tables to check matching \"%s\"",
 							 pat->pattern);
-			else if (pat->btree_only)
-				log_no_match("no btree indexes to check matching \"%s\"",
+			else if (pat->index_only)
+				log_no_match("no indexes to check matching \"%s\"",
 							 pat->pattern);
 			else if (pat->rel_regex == NULL)
 				log_no_match("no relations to check in schemas matching \"%s\"",
@@ -784,13 +787,29 @@ main(int argc, char *argv[])
 				if (opts.show_progress && progress_since_last_stderr)
 					fprintf(stderr, "\n");
 
-				pg_log_info("checking btree index \"%s.%s.%s\"",
+				pg_log_info("checking index \"%s.%s.%s\"",
 							rel->datinfo->datname, rel->nspname, rel->relname);
 				progress_since_last_stderr = false;
 			}
-			prepare_btree_command(&sql, rel, free_slot->connection);
+			if (rel->amoid == BTREE_AM_OID)
+				prepare_btree_command(&sql, rel, free_slot->connection);
+			else if (rel->amoid == GIST_AM_OID)
+			{
+				if (rel->datinfo->gist_supported)
+					prepare_gist_command(&sql, rel, free_slot->connection);
+				else
+				{
+					if (!gist_warn_printed)
+						pg_log_warning("GiST verification is not supported by installed amcheck version");
+					gist_warn_printed = true;
+				}
+			}
+			else
+				/* should not happen at this stage */
+				pg_log_info("verification of index type %u is not supported",
+							rel->amoid);
 			rel->sql = pstrdup(sql.data);	/* pg_free'd after command */
-			ParallelSlotSetHandler(free_slot, verify_btree_slot_handler, rel);
+			ParallelSlotSetHandler(free_slot, verify_index_slot_handler, rel);
 			run_command(free_slot, rel->sql);
 		}
 	}
@@ -868,7 +887,7 @@ prepare_heap_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
  * Creates a SQL command for running amcheck checking on the given btree index
  * relation.  The command does not select any columns, as btree checking
  * functions do not return any, but rather return corruption information by
- * raising errors, which verify_btree_slot_handler expects.
+ * raising errors, which verify_index_slot_handler expects.
  *
  * The constructed SQL command will silently skip temporary indexes, and
  * indexes being reindexed concurrently, as checking them would needlessly draw
@@ -914,6 +933,28 @@ prepare_btree_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
 						  rel->reloid);
 }
 
+/*
+ * prepare_gist_command
+ * Similar to the btree equivalent; prepares a command to check a GiST index.
+ */
+static void
+prepare_gist_command(PQExpBuffer sql, RelationInfo *rel, PGconn *conn)
+{
+	resetPQExpBuffer(sql);
+
+	appendPQExpBuffer(sql,
+					  "SELECT %s.gist_index_check("
+					  "index := c.oid, heapallindexed := %s)"
+					  "\nFROM pg_catalog.pg_class c, pg_catalog.pg_index i "
+					  "WHERE c.oid = %u "
+					  "AND c.oid = i.indexrelid "
+					  "AND c.relpersistence != 't' "
+					  "AND i.indisready AND i.indisvalid AND i.indislive",
+					  rel->datinfo->amcheck_schema,
+					  (opts.heapallindexed ? "true" : "false"),
+					  rel->reloid);
+}
+
 /*
  * run_command
  *
@@ -953,7 +994,7 @@ run_command(ParallelSlot *slot, const char *sql)
  * Note: Heap relation corruption is reported by verify_heapam() via the result
  * set, rather than an ERROR, but running verify_heapam() on a corrupted heap
  * table may still result in an error being returned from the server due to
- * missing relation files, bad checksums, etc.  The btree corruption checking
+ * missing relation files, bad checksums, etc.  The corruption checking
  * functions always use errors to communicate corruption messages.  We can't
  * just abort processing because we got a mere ERROR.
  *
@@ -1103,11 +1144,11 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
 }
 
 /*
- * verify_btree_slot_handler
+ * verify_index_slot_handler
  *
- * ParallelSlotHandler that receives results from a btree checking command
- * created by prepare_btree_command and outputs them for the user.  The results
- * from the btree checking command is assumed to be empty, but when the results
+ * ParallelSlotHandler that receives results from a checking command created by
+ * prepare_[btree,gist]_command and outputs them for the user.  The results
+ * from the checking command are assumed to be empty, but when the results
  * are an error code, the useful information about the corruption is expected
  * in the connection's error message.
  *
@@ -1116,7 +1157,7 @@ verify_heap_slot_handler(PGresult *res, PGconn *conn, void *context)
  * context: unused
  */
 static bool
-verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
+verify_index_slot_handler(PGresult *res, PGconn *conn, void *context)
 {
 	RelationInfo *rel = (RelationInfo *) context;
 
@@ -1127,12 +1168,12 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		if (ntups > 1)
 		{
 			/*
-			 * We expect the btree checking functions to return one void row
-			 * each, or zero rows if the check was skipped due to the object
-			 * being in the wrong state to be checked, so we should output
-			 * some sort of warning if we get anything more, not because it
-			 * indicates corruption, but because it suggests a mismatch
-			 * between amcheck and pg_amcheck versions.
+			 * We expect the checking functions to return one void row each,
+			 * or zero rows if the check was skipped due to the object being
+			 * in the wrong state to be checked, so we should output some sort
+			 * of warning if we get anything more, not because it indicates
+			 * corruption, but because it suggests a mismatch between amcheck
+			 * and pg_amcheck versions.
 			 *
 			 * In conjunction with --progress, anything written to stderr at
 			 * this time would present strangely to the user without an extra
@@ -1142,7 +1183,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 			 */
 			if (opts.show_progress && progress_since_last_stderr)
 				fprintf(stderr, "\n");
-			pg_log_warning("btree index \"%s.%s.%s\": btree checking function returned unexpected number of rows: %d",
+			pg_log_warning("index \"%s.%s.%s\": checking function returned unexpected number of rows: %d",
 						   rel->datinfo->datname, rel->nspname, rel->relname, ntups);
 			if (opts.verbose)
 				pg_log_warning_detail("Query was: %s", rel->sql);
@@ -1156,7 +1197,7 @@ verify_btree_slot_handler(PGresult *res, PGconn *conn, void *context)
 		char	   *msg = indent_lines(PQerrorMessage(conn));
 
 		all_checks_pass = false;
-		printf(_("btree index \"%s.%s.%s\":\n"),
+		printf(_("index \"%s.%s.%s\":\n"),
 			   rel->datinfo->datname, rel->nspname, rel->relname);
 		printf("%s", msg);
 		if (opts.verbose)
@@ -1210,6 +1251,8 @@ help(const char *progname)
 	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("      --parent-check              check index parent/child relationships\n"));
 	printf(_("      --rootdescend               search from root page to refind tuples\n"));
+	printf(_("\nGiST index checking options:\n"));
+	printf(_("      --heapallindexed            check that all heap tuples are found within indexes\n"));
 	printf(_("\nConnection options:\n"));
 	printf(_("  -h, --host=HOSTNAME             database server host or socket directory\n"));
 	printf(_("  -p, --port=PORT                 database server port\n"));
@@ -1423,11 +1466,11 @@ append_schema_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  * heap_only: whether the pattern should only be matched against heap tables
- * btree_only: whether the pattern should only be matched against btree indexes
+ * index_only: whether the pattern should only be matched against indexes
  */
 static void
 append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
-							   int encoding, bool heap_only, bool btree_only)
+							   int encoding, bool heap_only, bool index_only)
 {
 	PQExpBufferData dbbuf;
 	PQExpBufferData nspbuf;
@@ -1462,14 +1505,14 @@ append_relation_pattern_helper(PatternInfoArray *pia, const char *pattern,
 	termPQExpBuffer(&relbuf);
 
 	info->heap_only = heap_only;
-	info->btree_only = btree_only;
+	info->index_only = index_only;
 }
 
 /*
  * append_relation_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched
- * against both heap tables and btree indexes.
+ * against both heap tables and indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
@@ -1498,17 +1541,17 @@ append_heap_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 }
 
 /*
- * append_btree_pattern
+ * append_index_pattern
  *
  * Adds the given pattern interpreted as a relation pattern, to be matched only
- * against btree indexes.
+ * against indexes.
  *
  * pia: the pattern info array to be appended
  * pattern: the relation name pattern
  * encoding: client encoding for parsing the pattern
  */
 static void
-append_btree_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
+append_index_pattern(PatternInfoArray *pia, const char *pattern, int encoding)
 {
 	append_relation_pattern_helper(pia, pattern, encoding, false, true);
 }
@@ -1766,7 +1809,7 @@ compile_database_list(PGconn *conn, SimplePtrList *databases,
  *     rel_regex: the relname regexp parsed from the pattern, or NULL if the
  *                pattern had no relname part
  *     heap_only: true if the pattern applies only to heap tables (not indexes)
- *     btree_only: true if the pattern applies only to btree indexes (not tables)
+ *     index_only: true if the pattern applies only to indexes (not tables)
  *
  * buf: the buffer to be appended
  * patterns: the array of patterns to be inserted into the CTE
@@ -1808,7 +1851,7 @@ append_rel_pattern_raw_cte(PQExpBuffer buf, const PatternInfoArray *pia,
 			appendPQExpBufferStr(buf, "::TEXT, true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, "::TEXT, false::BOOLEAN");
-		if (info->btree_only)
+		if (info->index_only)
 			appendPQExpBufferStr(buf, ", true::BOOLEAN");
 		else
 			appendPQExpBufferStr(buf, ", false::BOOLEAN");
@@ -1846,8 +1889,8 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
 								const char *filtered, PGconn *conn)
 {
 	appendPQExpBuffer(buf,
-					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, btree_only) AS ("
-					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, btree_only "
+					  "\n%s (pattern_id, nsp_regex, rel_regex, heap_only, index_only) AS ("
+					  "\nSELECT pattern_id, nsp_regex, rel_regex, heap_only, index_only "
 					  "FROM %s r"
 					  "\nWHERE (r.db_regex IS NULL "
 					  "OR ",
@@ -1870,7 +1913,7 @@ append_rel_pattern_filtered_cte(PQExpBuffer buf, const char *raw,
  * The cells of the constructed list contain all information about the relation
  * necessary to connect to the database and check the object, including which
  * database to connect to, where contrib/amcheck is installed, and the Oid and
- * type of object (heap table vs. btree index).  Rather than duplicating the
+ * type of object (heap table vs. index).  Rather than duplicating the
  * database details per relation, the relation structs use references to the
  * same database object, provided by the caller.
  *
@@ -1897,7 +1940,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (!opts.allrel)
 	{
 		appendPQExpBufferStr(&sql,
-							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " include_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.include, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "include_raw", "include_pat", conn);
@@ -1907,7 +1950,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 	{
 		appendPQExpBufferStr(&sql,
-							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, btree_only) AS (");
+							 " exclude_raw (pattern_id, db_regex, nsp_regex, rel_regex, heap_only, index_only) AS (");
 		append_rel_pattern_raw_cte(&sql, &opts.exclude, conn);
 		appendPQExpBufferStr(&sql, "\n),");
 		append_rel_pattern_filtered_cte(&sql, "exclude_raw", "exclude_pat", conn);
@@ -1915,36 +1958,36 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 
 	/* Append the relation CTE. */
 	appendPQExpBufferStr(&sql,
-						 " relation (pattern_id, oid, nspname, relname, reltoastrelid, relpages, is_heap, is_btree) AS ("
+						 " relation (pattern_id, oid, amoid, nspname, relname, reltoastrelid, relpages, is_heap, is_index) AS ("
 						 "\nSELECT DISTINCT ON (c.oid");
 	if (!opts.allrel)
 		appendPQExpBufferStr(&sql, ", ip.pattern_id) ip.pattern_id,");
 	else
 		appendPQExpBufferStr(&sql, ") NULL::INTEGER AS pattern_id,");
 	appendPQExpBuffer(&sql,
-					  "\nc.oid, n.nspname, c.relname, c.reltoastrelid, c.relpages, "
-					  "c.relam = %u AS is_heap, "
-					  "c.relam = %u AS is_btree"
+					  "\nc.oid, c.relam as amoid, n.nspname, c.relname, "
+					  "c.reltoastrelid, c.relpages, c.relam = %u AS is_heap, "
+					  "(c.relam = %u OR c.relam = %u) AS is_index"
 					  "\nFROM pg_catalog.pg_class c "
 					  "INNER JOIN pg_catalog.pg_namespace n "
 					  "ON c.relnamespace = n.oid",
-					  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+					  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (!opts.allrel)
 		appendPQExpBuffer(&sql,
 						  "\nINNER JOIN include_pat ip"
 						  "\nON (n.nspname ~ ip.nsp_regex OR ip.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ip.rel_regex OR ip.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ip.heap_only)"
-						  "\nAND (c.relam = %u OR NOT ip.btree_only)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ip.index_only)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 	if (opts.excludetbl || opts.excludeidx || opts.excludensp)
 		appendPQExpBuffer(&sql,
 						  "\nLEFT OUTER JOIN exclude_pat ep"
 						  "\nON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL)"
 						  "\nAND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL)"
 						  "\nAND (c.relam = %u OR NOT ep.heap_only OR ep.rel_regex IS NULL)"
-						  "\nAND (c.relam = %u OR NOT ep.btree_only OR ep.rel_regex IS NULL)",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  "\nAND ((c.relam = %u OR c.relam = %u) OR NOT ep.index_only OR ep.rel_regex IS NULL)",
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	/*
 	 * Exclude temporary tables and indexes, which must necessarily belong to
@@ -1983,7 +2026,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  HEAP_TABLE_AM_OID, PG_TOAST_NAMESPACE);
 	else
 		appendPQExpBuffer(&sql,
-						  " AND c.relam IN (%u, %u)"
+						  " AND c.relam IN (%u, %u, %u)"
 						  "AND c.relkind IN ("
 						  CppAsString2(RELKIND_RELATION) ", "
 						  CppAsString2(RELKIND_SEQUENCE) ", "
@@ -1995,10 +2038,10 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						  CppAsString2(RELKIND_SEQUENCE) ", "
 						  CppAsString2(RELKIND_MATVIEW) ", "
 						  CppAsString2(RELKIND_TOASTVALUE) ")) OR "
-						  "(c.relam = %u AND c.relkind = "
+						  "((c.relam = %u OR c.relam = %u) AND c.relkind = "
 						  CppAsString2(RELKIND_INDEX) "))",
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID,
-						  HEAP_TABLE_AM_OID, BTREE_AM_OID);
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID,
+						  HEAP_TABLE_AM_OID, BTREE_AM_OID, GIST_AM_OID);
 
 	appendPQExpBufferStr(&sql,
 						 "\nORDER BY c.oid)");
@@ -2027,7 +2070,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql,
 							 "\n)");
 	}
-	if (!opts.no_btree_expansion)
+	if (!opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with primary heap tables
@@ -2035,9 +2078,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		 * btree index names.
 		 */
 		appendPQExpBufferStr(&sql,
-							 ", index (oid, nspname, relname, relpages) AS ("
-							 "\nSELECT c.oid, r.nspname, c.relname, c.relpages "
-							 "FROM relation r"
+							 ", index (oid, amoid, nspname, relname, relpages) AS ("
+							 "\nSELECT c.oid, c.relam as amoid, r.nspname, "
+							 "c.relname, c.relpages FROM relation r"
 							 "\nINNER JOIN pg_catalog.pg_index i "
 							 "ON r.oid = i.indrelid "
 							 "INNER JOIN pg_catalog.pg_class c "
@@ -2050,15 +2093,15 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON (n.nspname ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only"
+								 "AND ep.index_only"
 								 "\nWHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
 								 "\nWHERE true");
 		appendPQExpBuffer(&sql,
-						  " AND c.relam = %u "
+						  " AND (c.relam = %u or c.relam = %u) "
 						  "AND c.relkind = " CppAsString2(RELKIND_INDEX),
-						  BTREE_AM_OID);
+						  BTREE_AM_OID, GIST_AM_OID);
 		if (opts.no_toast_expansion)
 			appendPQExpBuffer(&sql,
 							  " AND c.relnamespace != %u",
@@ -2066,7 +2109,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		appendPQExpBufferStr(&sql, "\n)");
 	}
 
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
 	{
 		/*
 		 * Include a CTE for btree indexes associated with toast tables of
@@ -2087,7 +2130,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 								 "\nLEFT OUTER JOIN exclude_pat ep "
 								 "ON ('pg_toast' ~ ep.nsp_regex OR ep.nsp_regex IS NULL) "
 								 "AND (c.relname ~ ep.rel_regex OR ep.rel_regex IS NULL) "
-								 "AND ep.btree_only "
+								 "AND ep.index_only "
 								 "WHERE ep.pattern_id IS NULL");
 		else
 			appendPQExpBufferStr(&sql,
@@ -2107,12 +2150,13 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	 * list.
 	 */
 	appendPQExpBufferStr(&sql,
-						 "\nSELECT pattern_id, is_heap, is_btree, oid, nspname, relname, relpages "
+						 "\nSELECT pattern_id, is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM (");
 	appendPQExpBufferStr(&sql,
 	/* Inclusion patterns that failed to match */
-						 "\nSELECT pattern_id, is_heap, is_btree, "
+						 "\nSELECT pattern_id, is_heap, is_index, "
 						 "NULL::OID AS oid, "
+						 "NULL::OID AS amoid, "
 						 "NULL::TEXT AS nspname, "
 						 "NULL::TEXT AS relname, "
 						 "NULL::INTEGER AS relpages"
@@ -2121,29 +2165,29 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 						 "UNION"
 	/* Primary relations */
 						 "\nSELECT NULL::INTEGER AS pattern_id, "
-						 "is_heap, is_btree, oid, nspname, relname, relpages "
+						 "is_heap, is_index, oid, amoid, nspname, relname, relpages "
 						 "FROM relation");
 	if (!opts.no_toast_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Toast tables for primary relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
-							 "FALSE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast");
-	if (!opts.no_btree_expansion)
+						  "\nSELECT NULL::INTEGER AS pattern_id, TRUE AS is_heap, "
+						  "FALSE AS is_index, oid, 0 as amoid, nspname, relname, relpages "
+						  "FROM toast");
+	if (!opts.no_index_expansion)
 		appendPQExpBufferStr(&sql,
 							 " UNION"
 		/* Indexes for primary relations */
 							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
+							 "TRUE AS is_index, oid, amoid, nspname, relname, relpages "
 							 "FROM index");
-	if (!opts.no_toast_expansion && !opts.no_btree_expansion)
-		appendPQExpBufferStr(&sql,
-							 " UNION"
+	if (!opts.no_toast_expansion && !opts.no_index_expansion)
+		appendPQExpBuffer(&sql,
+						  " UNION"
 		/* Indexes for toast relations */
-							 "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
-							 "TRUE AS is_btree, oid, nspname, relname, relpages "
-							 "FROM toast_index");
+						  "\nSELECT NULL::INTEGER AS pattern_id, FALSE AS is_heap, "
+						  "TRUE AS is_index, oid, %u as amoid, nspname, relname, relpages "
+						  "FROM toast_index", BTREE_AM_OID);
 	appendPQExpBufferStr(&sql,
 						 "\n) AS combined_records "
 						 "ORDER BY relpages DESC NULLS FIRST, oid");
@@ -2163,8 +2207,9 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 	{
 		int			pattern_id = -1;
 		bool		is_heap = false;
-		bool		is_btree PG_USED_FOR_ASSERTS_ONLY = false;
+		bool		is_index PG_USED_FOR_ASSERTS_ONLY = false;
 		Oid			oid = InvalidOid;
+		Oid			amoid = InvalidOid;
 		const char *nspname = NULL;
 		const char *relname = NULL;
 		int			relpages = 0;
@@ -2174,15 +2219,17 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 		if (!PQgetisnull(res, i, 1))
 			is_heap = (PQgetvalue(res, i, 1)[0] == 't');
 		if (!PQgetisnull(res, i, 2))
-			is_btree = (PQgetvalue(res, i, 2)[0] == 't');
+			is_index = (PQgetvalue(res, i, 2)[0] == 't');
 		if (!PQgetisnull(res, i, 3))
 			oid = atooid(PQgetvalue(res, i, 3));
 		if (!PQgetisnull(res, i, 4))
-			nspname = PQgetvalue(res, i, 4);
+			amoid = atooid(PQgetvalue(res, i, 4));
 		if (!PQgetisnull(res, i, 5))
-			relname = PQgetvalue(res, i, 5);
+			nspname = PQgetvalue(res, i, 5);
 		if (!PQgetisnull(res, i, 6))
-			relpages = atoi(PQgetvalue(res, i, 6));
+			relname = PQgetvalue(res, i, 6);
+		if (!PQgetisnull(res, i, 7))
+			relpages = atoi(PQgetvalue(res, i, 7));
 
 		if (pattern_id >= 0)
 		{
@@ -2204,10 +2251,11 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			RelationInfo *rel = (RelationInfo *) pg_malloc0(sizeof(RelationInfo));
 
 			Assert(OidIsValid(oid));
-			Assert((is_heap && !is_btree) || (is_btree && !is_heap));
+			Assert((is_heap && !is_index) || (is_index && !is_heap));
 
 			rel->datinfo = dat;
 			rel->reloid = oid;
+			rel->amoid = amoid;
 			rel->is_heap = is_heap;
 			rel->nspname = pstrdup(nspname);
 			rel->relname = pstrdup(relname);
@@ -2217,7 +2265,7 @@ compile_relation_list_one_db(PGconn *conn, SimplePtrList *relations,
 			{
 				/*
 				 * We apply --startblock and --endblock to heap tables, but
-				 * not btree indexes, and for progress purposes we need to
+				 * not to supported index types, and for progress purposes we need to
 				 * track how many blocks we expect to check.
 				 */
 				if (opts.endblock >= 0 && rel->blocks_to_check > opts.endblock)
diff --git a/src/bin/pg_amcheck/t/002_nonesuch.pl b/src/bin/pg_amcheck/t/002_nonesuch.pl
index 67d700ea07a..d4cc0664f3b 100644
--- a/src/bin/pg_amcheck/t/002_nonesuch.pl
+++ b/src/bin/pg_amcheck/t/002_nonesuch.pl
@@ -272,8 +272,8 @@ $node->command_checks_all(
 	[
 		qr/pg_amcheck: warning: no heap tables to check matching "no_such_table"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no_such_index"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "no\*such\*index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no_such_index"/,
+		qr/pg_amcheck: warning: no indexes to check matching "no\*such\*index"/,
 		qr/pg_amcheck: warning: no relations to check matching "no_such_relation"/,
 		qr/pg_amcheck: warning: no relations to check matching "no\*such\*relation"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "no\*such\*table"/,
@@ -350,8 +350,8 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: no heap tables to check matching "template1\.public\.foo"/,
 		qr/pg_amcheck: warning: no heap tables to check matching "another_db\.public\.foo"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "template1\.public\.foo_idx"/,
-		qr/pg_amcheck: warning: no btree indexes to check matching "another_db\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "template1\.public\.foo_idx"/,
+		qr/pg_amcheck: warning: no indexes to check matching "another_db\.public\.foo_idx"/,
 		qr/pg_amcheck: warning: no connectable databases to check matching "no_such_database\.public\.foo_idx"/,
 		qr/pg_amcheck: error: no relations to check/,
 	],
diff --git a/src/bin/pg_amcheck/t/003_check.pl b/src/bin/pg_amcheck/t/003_check.pl
index 2b57c4dbac1..0aa66b24258 100644
--- a/src/bin/pg_amcheck/t/003_check.pl
+++ b/src/bin/pg_amcheck/t/003_check.pl
@@ -185,7 +185,7 @@ for my $dbname (qw(db1 db2 db3))
 	# schemas.  The schemas are all identical to start, but
 	# we will corrupt them differently later.
 	#
-	for my $schema (qw(s1 s2 s3 s4 s5))
+	for my $schema (qw(s1 s2 s3 s4 s5 s6))
 	{
 		$node->safe_psql(
 			$dbname, qq(
@@ -291,22 +291,24 @@ plan_to_corrupt_first_page('db1', 's3.t2_btree');
 # Corrupt toast table, partitions, and materialized views in schema "s4"
 plan_to_remove_toast_file('db1', 's4.t2');
 
-# Corrupt all other object types in schema "s5".  We don't have amcheck support
+# Corrupt GiST index in schema "s5"
+plan_to_remove_relation_file('db1', 's5.t1_gist');
+plan_to_corrupt_first_page('db1', 's5.t2_gist');
+
+# Corrupt all other object types in schema "s6".  We don't have amcheck support
 # for these types, but we check that their corruption does not trigger any
 # errors in pg_amcheck
-plan_to_remove_relation_file('db1', 's5.seq1');
-plan_to_remove_relation_file('db1', 's5.t1_hash');
-plan_to_remove_relation_file('db1', 's5.t1_gist');
-plan_to_remove_relation_file('db1', 's5.t1_gin');
-plan_to_remove_relation_file('db1', 's5.t1_brin');
-plan_to_remove_relation_file('db1', 's5.t1_spgist');
+plan_to_remove_relation_file('db1', 's6.seq1');
+plan_to_remove_relation_file('db1', 's6.t1_hash');
+plan_to_remove_relation_file('db1', 's6.t1_gin');
+plan_to_remove_relation_file('db1', 's6.t1_brin');
+plan_to_remove_relation_file('db1', 's6.t1_spgist');
 
-plan_to_corrupt_first_page('db1', 's5.seq2');
-plan_to_corrupt_first_page('db1', 's5.t2_hash');
-plan_to_corrupt_first_page('db1', 's5.t2_gist');
-plan_to_corrupt_first_page('db1', 's5.t2_gin');
-plan_to_corrupt_first_page('db1', 's5.t2_brin');
-plan_to_corrupt_first_page('db1', 's5.t2_spgist');
+plan_to_corrupt_first_page('db1', 's6.seq2');
+plan_to_corrupt_first_page('db1', 's6.t2_hash');
+plan_to_corrupt_first_page('db1', 's6.t2_gin');
+plan_to_corrupt_first_page('db1', 's6.t2_brin');
+plan_to_corrupt_first_page('db1', 's6.t2_spgist');
 
 
 # Database 'db2' corruptions
@@ -437,10 +439,22 @@ $node->command_checks_all(
 	[$no_output_re],
 	'pg_amcheck in schema s4 excluding toast reports no corruption');
 
-# Check that no corruption is reported in schema db1.s5
-$node->command_checks_all([ @cmd, '-s', 's5', 'db1' ],
+# In schema db1.s5 we should see GiST corruption messages on stdout, and
+# nothing on stderr.
+#
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	2,
+	[
+		$missing_file_re, $line_pointer_corruption_re,
+	],
+	[$no_output_re],
+	'pg_amcheck schema s5 reports GiST index errors');
+
+# Check that no corruption is reported in schema db1.s6
+$node->command_checks_all([ @cmd, '-s', 's6', 'db1' ],
 	0, [$no_output_re], [$no_output_re],
-	'pg_amcheck over schema s5 reports no corruption');
+	'pg_amcheck over schema s6 reports no corruption');
 
 # In schema db1.s1, only indexes are corrupt.  Verify that when we exclude
 # the indexes, no corruption is reported about the schema.
@@ -551,7 +565,7 @@ $node->command_checks_all(
 	'pg_amcheck excluding all corrupt schemas with --checkunique option');
 
 #
-# Smoke test for checkunique option for not supported versions.
+# Smoke test for the checkunique option and GiST indexes on unsupported amcheck versions.
 #
 $node->safe_psql(
 	'db3', q(
@@ -567,4 +581,19 @@ $node->command_checks_all(
 		qr/pg_amcheck: warning: option --checkunique is not supported by amcheck version 1.3/
 	],
 	'pg_amcheck smoke test --checkunique');
+
+$node->safe_psql(
+	'db1', q(
+		DROP EXTENSION amcheck;
+		CREATE EXTENSION amcheck WITH SCHEMA amcheck_schema VERSION '1.3' ;
+));
+
+$node->command_checks_all(
+	[ @cmd, '-s', 's5', 'db1' ],
+	0,
+	[$no_output_re],
+	[
+		qr/pg_amcheck: warning: GiST verification is not supported by installed amcheck version/
+	],
+	'pg_amcheck smoke test for GiST verification on unsupported amcheck version');
 done_testing();
-- 
2.34.1
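
To make the new pg_amcheck code path concrete, this is roughly the per-index query built by prepare_gist_command() above; the schema qualification and the OID are placeholders that pg_amcheck fills in at run time:

    SELECT public.gist_index_check(index := c.oid, heapallindexed := false)
    FROM pg_catalog.pg_class c, pg_catalog.pg_index i
    WHERE c.oid = 16385
      AND c.oid = i.indexrelid
      AND c.relpersistence != 't'
      AND i.indisready AND i.indisvalid AND i.indislive;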

v34-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchapplication/octet-stream; name=v34-0003-Add-gist_index_check-function-to-verify-GiST-ind.patchDownload
From 22d9ee1315c2de44bba228d3feefed1e016d6e57 Mon Sep 17 00:00:00 2001
From: "Andrey M. Borodin" <x4mmm@flight.local>
Date: Sat, 23 Jul 2022 14:17:44 +0500
Subject: [PATCH v34 3/6] Add gist_index_check() function to verify GiST index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This function traverses the GiST index with a depth-first search and
checks that all downlink tuples are contained within their parent
tuple's keyspace. The traversal locks only one page at a time, until a
discrepancy is found. To re-check a suspicious pair of parent and child
tuples it acquires locks on both the parent and child pages in the same
order as a page split does.

Author: Andrey Borodin <amborodin@acm.org>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
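
A minimal usage sketch (the index name is a placeholder; the signature is the one declared in amcheck--1.4--1.5.sql below):

    SELECT gist_index_check('some_gist_idx'::regclass, false);
    -- additionally fingerprint heap tuples and verify they are all present in the index
    SELECT gist_index_check('some_gist_idx'::regclass, heapallindexed => true);
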
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out | 145 +++++
 contrib/amcheck/meson.build             |   3 +
 contrib/amcheck/sql/check_gist.sql      |  62 +++
 contrib/amcheck/verify_gist.c           | 687 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 +
 8 files changed, 935 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..952e458c53b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gist.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gist check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..3fc72364180
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass, heapallindexed boolean)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass, boolean) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 00000000000..cbc3e27e679
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,145 @@
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', false);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx1', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+SELECT gist_index_check('gist_check_idx2', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+ attstorage 
+------------
+ x
+(1 row)
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 6a1ba5d7619..5ab83ca7778 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gist.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gist',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 00000000000..37966423b8b
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,62 @@
+
+SELECT setseed(1);
+
+-- Test that index built with bulk load is correct
+CREATE TABLE gist_check AS SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx1 ON gist_check USING gist(c);
+CREATE INDEX gist_check_idx2 ON gist_check USING gist(c) INCLUDE(p);
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after inserts
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+-- Test that index is correct after vacuuming
+DELETE FROM gist_check WHERE c[1] < 5000; -- delete clustered data
+DELETE FROM gist_check WHERE c[1]::int % 2 = 0; -- delete scattered data
+
+-- We need two passes through the index and one global vacuum to actually
+-- reuse page
+VACUUM gist_check;
+VACUUM;
+
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+
+
+-- Test that index is correct after reusing pages
+INSERT INTO gist_check SELECT point(random(),s) c, random() p FROM generate_series(1,10000) s;
+SELECT gist_index_check('gist_check_idx1', false);
+SELECT gist_index_check('gist_check_idx2', false);
+SELECT gist_index_check('gist_check_idx1', true);
+SELECT gist_index_check('gist_check_idx2', true);
+-- cleanup
+DROP TABLE gist_check;
+
+--
+-- Similar to BUG #15597
+--
+CREATE TABLE toast_bug(c point,buggy text);
+ALTER TABLE toast_bug ALTER COLUMN buggy SET STORAGE extended;
+CREATE INDEX toasty ON toast_bug USING gist(c) INCLUDE(buggy);
+
+-- pg_attribute entry for toasty.buggy (the index) will have plain storage:
+UPDATE pg_attribute SET attstorage = 'p'
+WHERE attrelid = 'toasty'::regclass AND attname = 'buggy';
+
+-- Whereas pg_attribute entry for toast_bug.buggy (the table) still has extended storage:
+SELECT attstorage FROM pg_attribute
+WHERE attrelid = 'toast_bug'::regclass AND attname = 'buggy';
+
+-- Insert compressible heap tuple (comfortably exceeds TOAST_TUPLE_THRESHOLD):
+INSERT INTO toast_bug SELECT point(0,0), repeat('a', 2200);
+-- Should not get false positive report of corruption:
+SELECT gist_index_check('toasty', true);
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..477150ac802
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,687 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include the tuples
+ * on their child pages. Verification also checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/tableam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "common/pg_prng.h"
+#include "lib/bloomfilter.h"
+#include "verify_common.h"
+#include "utils/memutils.h"
+
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+
+	/* Referenced block number to check next */
+	BlockNumber blkno;
+
+	/*
+	 * Correctness of this parent tuple will be checked against the contents
+	 * of the referenced page. This tuple is NULL for the root block.
+	 */
+	IndexTuple	parenttup;
+
+	/*
+	 * LSN to handle concurrent scans of the page. It's necessary to avoid
+	 * missing subtrees of a page that was split just before we read it.
+	 */
+	XLogRecPtr	parentlsn;
+
+	/*
+	 * Reference to the parent page, for re-locking in case a parent-child
+	 * tuple discrepancy is found.
+	 */
+	BlockNumber parentblk;
+
+	/* Pointer to a next stack item. */
+	struct GistScanItem *next;
+}			GistScanItem;
+
+typedef struct GistCheckState
+{
+	/* GiST state */
+	GISTSTATE  *state;
+	/* Bloom filter fingerprints index tuples */
+	bloom_filter *filter;
+
+	Snapshot	snapshot;
+	Relation	rel;
+	Relation	heaprel;
+
+	/* Debug counter for reporting percentage of work already done */
+	int64		heaptuplespresent;
+
+	/* progress reporting stuff */
+	BlockNumber totalblocks;
+	BlockNumber reportedblocks;
+	BlockNumber scannedblocks;
+	BlockNumber deltablocks;
+
+	int			leafdepth;
+}			GistCheckState;
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+static void giststate_init_heapallindexed(Relation rel, GistCheckState * result);
+static void gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+											   void *callback_state, bool readonly);
+static void gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+							Page page, bool heapallindexed,
+							BufferAccessStrategy strategy);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+								   Page page, OffsetNumber offset);
+static void gist_tuple_present_callback(Relation index, ItemPointer tid,
+										Datum *values, bool *isnull,
+										bool tupleIsAlive, void *checkstate);
+static IndexTuple gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+										  Datum *attdata, bool *isnull, ItemPointerData tid);
+
+/*
+ * gist_index_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	bool		heapallindexed = PG_GETARG_BOOL(1);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIST_AM_OID,
+									gist_check_parent_keys_consistency,
+									AccessShareLock,
+									&heapallindexed);
+
+	PG_RETURN_VOID();
+}
+
+/*
+* Initialize the GiST check state fields needed for heapallindexed verification.
+* This initializes the bloom filter and the snapshot.
+*/
+static void
+giststate_init_heapallindexed(Relation rel, GistCheckState * result)
+{
+	int64		total_pages;
+	int64		total_elems;
+	uint64		seed;
+
+	/*
+	 * Size Bloom filter based on estimated number of tuples in index. This
+	 * logic is similar to B-Tree's, see verify_nbtree.c.
+	 */
+	total_pages = result->totalblocks;
+	total_elems = Max(total_pages * (MaxOffsetNumber / 5),
+					  (int64) rel->rd_rel->reltuples);
+	seed = pg_prng_uint64(&pg_global_prng_state);
+	result->filter = bloom_create(total_elems, maintenance_work_mem, seed);
+
+	result->snapshot = RegisterSnapshot(GetTransactionSnapshot());
+
+
+	/*
+	 * GetTransactionSnapshot() always acquires a new MVCC snapshot in READ
+	 * COMMITTED mode.  A new snapshot is guaranteed to have all the entries
+	 * it requires in the index.
+	 *
+	 * We must defend against the possibility that an old xact snapshot was
+	 * returned at higher isolation levels when that snapshot is not safe for
+	 * index scans of the target index.  This is possible when the snapshot
+	 * sees tuples that are before the index's indcheckxmin horizon.  Throwing
+	 * an error here should be very rare.  It doesn't seem worth using a
+	 * secondary snapshot to avoid this.
+	 */
+	if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&
+		!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),
+							   result->snapshot->xmin))
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("index \"%s\" cannot be verified using transaction snapshot",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * Main entry point for GiST check.
+ *
+ * This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on a leaf page.  To do this we invoke
+ * gist_check_page() for every page.
+ *
+ * This check allocates memory context and scans through
+ * GiST graph. This scan is performed in a depth-first search using a stack of
+ * GistScanItem-s. Initially this stack contains only the root block number. On
+ * each iteration the top block number is replaced by the referenced block numbers.
+ *
+ *
+ * gist_check_page() in its turn takes every tuple and tries to adjust
+ * it using the tuples on the referenced child page.  A parent GiST tuple should
+ * never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel, Relation heaprel,
+								   void *callback_state, bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	bool		heapallindexed = *((bool *) callback_state);
+	GistCheckState *check_state = palloc0(sizeof(GistCheckState));
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	check_state->state = state;
+	check_state->rel = rel;
+	check_state->heaprel = heaprel;
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	check_state->leafdepth = -1;
+
+	check_state->totalblocks = RelationGetNumberOfBlocks(rel);
+	/* report every 100 blocks or 5%, whichever is bigger */
+	check_state->deltablocks = Max(check_state->totalblocks / 20, 100);
+
+	if (heapallindexed)
+		giststate_init_heapallindexed(rel, check_state);
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	/*
+	 * This GiST scan is effectively "old" VACUUM version before commit
+	 * fe280694d which introduced physical order scanning.
+	 */
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		/* Report progress */
+		if (check_state->scannedblocks > check_state->reportedblocks +
+			check_state->deltablocks)
+		{
+			elog(DEBUG1, "verified %u blocks of approximately %u total",
+				 check_state->scannedblocks, check_state->totalblocks);
+			check_state->reportedblocks = check_state->scannedblocks;
+		}
+		check_state->scannedblocks++;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we might have missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		gist_check_page(check_state, stack, page, heapallindexed, strategy);
+
+		if (!GistPageIsLeaf(page))
+		{
+			OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+			for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				/* Internal page, so recurse to the child */
+				GistScanItem *ptr;
+				ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+				IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	if (heapallindexed)
+	{
+		IndexInfo  *indexinfo = BuildIndexInfo(rel);
+		TableScanDesc scan;
+
+		scan = table_beginscan_strat(heaprel,	/* relation */
+									 check_state->snapshot, /* snapshot */
+									 0, /* number of keys */
+									 NULL,	/* scan key */
+									 true,	/* buffer access strategy OK */
+									 true); /* syncscan OK? */
+
+		/*
+		 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY.
+		 */
+		indexinfo->ii_Concurrent = true;
+
+		indexinfo->ii_Unique = false;
+		indexinfo->ii_ExclusionOps = NULL;
+		indexinfo->ii_ExclusionProcs = NULL;
+		indexinfo->ii_ExclusionStrats = NULL;
+
+		elog(DEBUG1, "verifying that tuples from index \"%s\" are present in \"%s\"",
+			 RelationGetRelationName(rel),
+			 RelationGetRelationName(heaprel));
+
+		table_index_build_scan(heaprel, rel, indexinfo, true, false,
+							   gist_tuple_present_callback, (void *) check_state, scan);
+
+		ereport(DEBUG1,
+				(errmsg_internal("finished verifying presence of " INT64_FORMAT " tuples from table \"%s\" with bitset %.2f%% set",
+								 check_state->heaptuplespresent,
+								 RelationGetRelationName(heaprel),
+								 100.0 * bloom_prop_bits_set(check_state->filter))));
+
+		UnregisterSnapshot(check_state->snapshot);
+		bloom_free(check_state->filter);
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+	pfree(check_state);
+}
+
+static void
+gist_check_page(GistCheckState * check_state, GistScanItem * stack,
+				Page page, bool heapallindexed, BufferAccessStrategy strategy)
+{
+	OffsetNumber maxoff = PageGetMaxOffsetNumber(page);
+
+	/* Check that the tree has the same height in all branches */
+	if (GistPageIsLeaf(page))
+	{
+		if (check_state->leafdepth == -1)
+			check_state->leafdepth = stack->depth;
+		else if (stack->depth != check_state->leafdepth)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+							RelationGetRelationName(check_state->rel), stack->blkno)));
+	}
+
+	/*
+	 * Check that each tuple looks valid, and is consistent with the downlink
+	 * we followed when we stepped on this page.
+	 */
+	for (OffsetNumber i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemIdCareful(check_state->rel, stack->blkno, page, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+		IndexTuple  tmpTuple = NULL;
+
+		/*
+		 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+		 * also gistdoinsert() and gistbulkdelete() handling of such tuples.
+		 * We do consider it an error here.
+		 */
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+							RelationGetRelationName(check_state->rel), stack->blkno, i)));
+
+		/*
+		 * Check if this tuple is consistent with the downlink in the parent.
+		 */
+		if (stack->parenttup)
+			tmpTuple = gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state);
+
+		if (tmpTuple)
+		{
+			/*
+			 * There was a discrepancy between parent and child tuples. We
+			 * need to verify it is not a result of concurrent call of
+			 * gistplacetopage(). So, lock parent and try to find downlink for
+			 * current page. It may be missing due to concurrent page split,
+			 * this is OK.
+			 *
+			 * Note that when we acquire the parent tuple now we hold locks on
+			 * both parent and child buffers. Thus the parent tuple must
+			 * include the keyspace of the child.
+			 */
+
+			pfree(tmpTuple);
+			pfree(stack->parenttup);
+			stack->parenttup = gist_refind_parent(check_state->rel, stack->parentblk,
+												  stack->blkno, strategy);
+
+			/* If we re-found it, make a final check before failing */
+			if (!stack->parenttup)
+				elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+					 stack->blkno, stack->parentblk);
+			else if (gistgetadjusted(check_state->rel, stack->parenttup, idxtuple, check_state->state))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+								RelationGetRelationName(check_state->rel), stack->blkno, i)));
+			else
+			{
+				/*
+				 * But now it is properly adjusted - nothing to do here.
+				 */
+			}
+		}
+
+		if (GistPageIsLeaf(page))
+		{
+			if (heapallindexed)
+				bloom_add_element(check_state->filter,
+								  (unsigned char *) idxtuple,
+								  IndexTupleSize(idxtuple));
+		}
+		else
+		{
+			OffsetNumber off = ItemPointerGetOffsetNumber(&(idxtuple->t_tid));
+
+			if (off != 0xffff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" on page %u offset %u has item id not pointing to 0xffff, but %hu",
+								RelationGetRelationName(check_state->rel), stack->blkno, i, off)));
+		}
+	}
+}
+
+/*
+ * gistFormNormalizedTuple - analogue to gistFormTuple, but performs deTOASTing
+ * of all included data (for covering indexes). While we do not expected
+ * toasted attributes in normal index, this can happen as a result of
+ * intervention into system catalog. Detoasting of key attributes is expected
+ * to be done by opclass decompression methods, if indexed type might be
+ * toasted.
+ */
+static IndexTuple
+gistFormNormalizedTuple(GISTSTATE *giststate, Relation r,
+						Datum *attdata, bool *isnull, ItemPointerData tid)
+{
+	Datum		compatt[INDEX_MAX_KEYS];
+	IndexTuple	res;
+
+	gistCompressValues(giststate, r, attdata, isnull, true, compatt);
+
+	for (int i = 0; i < r->rd_att->natts; i++)
+	{
+		Form_pg_attribute att;
+
+		att = TupleDescAttr(giststate->leafTupdesc, i);
+		if (att->attbyval || att->attlen != -1 || isnull[i])
+			continue;
+
+		if (VARATT_IS_EXTERNAL(DatumGetPointer(compatt[i])))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("external varlena datum in tuple that references heap row (%u,%u) in index \"%s\"",
+							ItemPointerGetBlockNumber(&tid),
+							ItemPointerGetOffsetNumber(&tid),
+							RelationGetRelationName(r))));
+		if (VARATT_IS_COMPRESSED(DatumGetPointer(compatt[i])))
+		{
+			/* Datum old = compatt[i]; */
+			/* Key attributes must never be compressed */
+			if (i < IndexRelationGetNumberOfKeyAttributes(r))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("compressed varlena datum in tuple key that references heap row (%u,%u) in index \"%s\"",
+								ItemPointerGetBlockNumber(&tid),
+								ItemPointerGetOffsetNumber(&tid),
+								RelationGetRelationName(r))));
+
+			compatt[i] = PointerGetDatum(PG_DETOAST_DATUM(compatt[i]));
+			/* pfree(DatumGetPointer(old)); // TODO: this fails. Why? */
+		}
+	}
+
+	res = index_form_tuple(giststate->leafTupdesc, compatt, isnull);
+
+	/*
+	 * The offset number on tuples on internal pages is unused. For historical
+	 * reasons, it is set to 0xffff.
+	 */
+	ItemPointerSetOffsetNumber(&(res->t_tid), 0xffff);
+	return res;
+}
+
+static void
+gist_tuple_present_callback(Relation index, ItemPointer tid, Datum *values,
+							bool *isnull, bool tupleIsAlive, void *checkstate)
+{
+	GistCheckState *state = (GistCheckState *) checkstate;
+	IndexTuple	itup = gistFormNormalizedTuple(state->state, index, values, isnull, *tid);
+
+	itup->t_tid = *tid;
+	/* Probe Bloom filter -- tuple should be present */
+	if (bloom_lacks_element(state->filter, (unsigned char *) itup,
+							IndexTupleSize(itup)))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATA_CORRUPTED),
+				 errmsg("heap tuple (%u,%u) from table \"%s\" lacks matching index tuple within index \"%s\"",
+						ItemPointerGetBlockNumber(&(itup->t_tid)),
+						ItemPointerGetOffsetNumber(&(itup->t_tid)),
+						RelationGetRelationName(state->heaprel),
+						RelationGetRelationName(state->rel))));
+
+	state->heaptuplespresent++;
+
+	pfree(itup);
+}
+
+/*
+ * check_index_page - verification of basic invariants of GiST page data.
+ * This function does not do any tuple analysis.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel,
+				   BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		/*
+		 * Currently GiST never deletes internal pages, thus they can never
+		 * become leaf pages.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page %d became leaf",
+						RelationGetRelationName(rel), parentblkno)));
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (OffsetNumber o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/*
+			 * Found it! Make a copy and return it while both parent and child
+			 * pages are locked. This guarantees that at this particular
+			 * moment the tuples are coherent with each other.
+			 */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GISTPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since gist
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..6eb526c6bb7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass, heapallindexed boolean) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuples
+      require tuple adjustment) and that the page graph respects
+      balanced-tree invariants (internal pages reference either only leaf
+      pages or only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
   <tip>
    <para>
-- 
2.34.1

v34-0006-Fix-wording-in-GIN-README.patchapplication/octet-stream; name=v34-0006-Fix-wording-in-GIN-README.patchDownload
From 4bc20e477d7d55373a9cddfedc5ed22b6f96de3d Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 3 Dec 2024 15:02:47 +0000
Subject: [PATCH v34 6/6] Fix wording in GIN README.

---
 src/backend/access/gin/README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index b0807316212..742bcbad499 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -237,10 +237,10 @@ GIN packs keys and downlinks into tuples in a different way.
 
 P_i is grouped with K_{i+1}.  -Inf key is not needed.
 
-There are couple of additional notes regarding K_{n+1} key.
-1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
+There are a couple of additional notes regarding K_{n+1} key.
+1) In the entry tree on the rightmost page, a key coupled with P_n doesn't really matter.
 Highkey is assumed to be infinity.
-2) In posting tree, a key coupled with P_n always doesn't matter.  Highkey for
+2) In the posting tree, a key coupled with P_n always doesn't matter.  Highkey for
 non-rightmost pages is stored separately and accessed via
 GinDataPageGetRightBound().
 
-- 
2.34.1

#59Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#58)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

Hi all.

I was reviewing a nearby thread about parallel index creation for GIN,
and spotted this test [0]/messages/by-id/87a5gyqnl5.fsf@163.com:

create table gin_t (a int[]);
insert into gin_t select * from rand_array(30000000, 0, 100, 0, 50);
create index on gin_t using gin(a);

v34 fails on this. The reason is that we should never explicitly check
the high key on the rightmost page of the GIN entry tree, as it is not
actually stored.
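
For reference, the failing check itself is just this, run after the index
build above (assuming amcheck from this patch set is installed; gin_t_a_idx
is the default name PostgreSQL gives that index):

create extension if not exists amcheck;
-- reports spurious corruption with the v34 check; passes with the fix attached here
select gin_index_check('gin_t_a_idx');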

PFA a little v35 fix for this. I will update this thread with a full v35
patch set shortly.

[0]: /messages/by-id/87a5gyqnl5.fsf@163.com

--
Best regards,
Kirill Reshke

Attachments:

v35-0001-Fix-for-gin_index_check.patchapplication/octet-stream; name=v35-0001-Fix-for-gin_index_check.patchDownload
From ba3d5cc709e9b1f0adef6616f77baba04a63f471 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Mon, 16 Dec 2024 10:49:43 +0000
Subject: [PATCH v35] Fix for gin_index_check.

Never explicitly check the high key on the rightmost page of the entry tree.
Its value is not stored explicitly and is equal to infinity.
---
 contrib/amcheck/verify_gin.c | 43 ++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 2dc5fbba619..74783af54d4 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -163,6 +163,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 		Page		page;
 		OffsetNumber i,
 					maxoff;
+		BlockNumber rightlink;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -170,6 +171,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
+
 		Assert(GinPageIsData(page));
 
 		/* Check that the tree has the same height in all branches */
@@ -234,8 +236,8 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 * Check that tuples in each page are properly ordered and
 			 * consistent with parent high key
 			 */
-			Assert(GinPageIsData(page));
 			maxoff = GinPageGetOpaque(page)->maxoff;
+			rightlink = GinPageGetOpaque(page)->rightlink;
 
 			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
 
@@ -273,7 +275,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 */
 			bound = *GinDataPageGetRightBound(page);
 
-			if (stack->parentblk != InvalidBlockNumber &&
+			/*
+			 * The GIN page right bound has a sane value only if it is not the
+			 * high key of the rightmost page on its level.
+			 */
+			if (ItemPointerIsValid(&stack->parentkey) &&
+				rightlink != InvalidBlockNumber &&
 				!ItemPointerEquals(&stack->parentkey, &bound))
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -287,11 +294,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 
 			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 			{
+				GinPostingTreeScanItem *ptr;
 				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
 
 				/* ItemPointerGetOffsetNumber expects a valid pointer */
 				if (!(i == maxoff &&
-					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					  rightlink == InvalidBlockNumber))
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 ItemPointerGetBlockNumber(&posting_item->key),
 						 ItemPointerGetOffsetNumber(&posting_item->key),
@@ -300,8 +308,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
 
-				if (i == maxoff &&
-					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 				{
 					/*
 					 * The rightmost item in the tree level has (0, 0) as the
@@ -340,19 +347,23 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RelationGetRelationName(rel),
 									stack->blkno, i)));
 
-				/* If this is an internal page, recurse into the child */
-				if (!GinPageIsLeaf(page))
-				{
-					GinPostingTreeScanItem *ptr;
+				/* This is an internal page, recurse into the child */
+				ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+				ptr->depth = stack->depth + 1;
 
-					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
-					ptr->depth = stack->depth + 1;
+				/*
+				 * Set the rightmost parent key to an invalid item pointer. Its
+				 * value is 'infinity' and is not explicitly stored.
+				 */
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
 					ptr->parentkey = posting_item->key;
-					ptr->parentblk = stack->blkno;
-					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
-					ptr->next = stack->next;
-					stack->next = ptr;
-				}
+
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+				ptr->next = stack->next;
+				stack->next = ptr;
 			}
 		}
 		LockBuffer(buffer, GIN_UNLOCK);
-- 
2.34.1

#60Tomas Vondra
tomas@vondra.me
In reply to: Kirill Reshke (#59)
Re: Amcheck verification of GiST and GIN

Hi,

I see this patch didn't move since December :-( I still think these
improvements would be useful, it certainly was very helpful when I was
working on the GIN/GiST parallel builds (the GiST builds stalled, but I
hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the GiST
bits, because I know very little about that AM (the parallel builds made
me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and the
recent changes were mostly cosmetic. Does anyone object to me polishing
and pushing those parts?

regards

--
Tomas Vondra

#61Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Tomas Vondra (#60)
Re: Amcheck verification of GiST and GIN

On Feb 21, 2025, at 6:29 AM, Tomas Vondra <tomas@vondra.me> wrote:

Hi,

I see this patch didn't move since December :-( I still think these
improvements would be useful, it certainly was very helpful when I was
working on the GIN/GiST parallel builds (the GiST builds stalled, but I
hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the GiST
bits, because I know very little about that AM (the parallel builds made
me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and the
recent changes were mostly cosmetic. Does anyone object to me polishing
and pushing those parts?

I infer that you intend to make v34-0004, v34-0006, and v35-0001 apply cleanly without the other patches and commit it that way. If that is correct, be advised that I'm doing a review and will respond back shortly, maybe in a few hours.


Mark Dilger
+001 (360) 271-8498
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#62Tomas Vondra
tomas@vondra.me
In reply to: Mark Dilger (#61)
Re: Amcheck verification of GiST and GIN

On 2/21/25 18:07, Mark Dilger wrote:

On Feb 21, 2025, at 6:29 AM, Tomas Vondra <tomas@vondra.me> wrote:

Hi,

I see this patch didn't move since December :-( I still think
these improvements would be useful, it certainly was very helpful
when I was working on the GIN/GiST parallel builds (the GiST
builds stalled, but I hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the
GiST bits, because I know very little about that AM (the parallel
builds made me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and
the recent changes were mostly cosmetic. Does anyone object to me
polishing and pushing those parts?

I infer that you intend to make v34-0004, v34-0006, and v35-0001
apply cleanly without the other patches and commit it that way. If
that is correct, be advised that I'm doing a review and will respond
back shortly, maybe in a few hours.

Something like that. I haven't tried to actually massage the patches
yet, maybe it'd require some of the refactoring patches too.

But OK, I'll wait for the review, no problem. I wouldn't get to this
until Monday anyway.

regards

--
Tomas Vondra

#63Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Tomas Vondra (#60)
Re: Amcheck verification of GiST and GIN

On 21 Feb 2025, at 19:29, Tomas Vondra <tomas@vondra.me> wrote:

I see this patch didn't move since December :-( I still think these
improvements would be useful, it certainly was very helpful when I was
working on the GIN/GiST parallel builds (the GiST builds stalled, but I
hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the GiST
bits, because I know very little about that AM (the parallel builds made
me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and the
recent changes were mostly cosmetic. Does anyone object to me polishing
and pushing those parts?

Hi Tomas!

Committing verification for any index type would help immensely. Currently we have many separate areas of work that just depend on this common part. GiST and GIN have code for verification, which is bound together by this patch set. If someone, e.g., wants to work on BRIN, they have to deal with the whole patch set.
If we have any second index AM in amcheck, no matter whether GiST or GIN, it's clear how to split the work on other AMs.

Kirill spent a lot of time ironing out various false positives from the GIN check. Kirill, what is your opinion about GIN verification? Does it look complete? (in the sense that it will not trigger false alarms; it certainly cannot catch all types of corruption)

Thanks!

Best regards, Andrey Borodin.

#64Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Mark Dilger (#61)
8 attachment(s)
Re: Amcheck verification of GiST and GIN

On Feb 21, 2025, at 9:07 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

I infer that you intend to make v34-0004, v34-0006, and v35-0001 apply cleanly without the other patches and commit it that way. If that is correct, be advised that I'm doing a review and will respond back shortly, maybe in a few hours.

Ok, here is my review:

v34-0001 looks fine
v34-0002 refactoring is needed by the gin patches, so I kept it in the patchset for review purposes
v34-0004 can mostly be applied without v34-0003, but a few changes are needed to make it apply cleanly.
v34-0006 looks fine
v35-0001 applies cleanly

I find the token quoting and capitalization patterns in sql/check_gin.sql somewhat confusing, but I tried to follow what is already there in extending that test to also check gin indexes over jsonb data using jsonb_path_ops. I think this is a common enough usage of gin that we should have test coverage for it.
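
Roughly the shape of what I added, as a sketch rather than the exact lines
from the attached patch (object names here are illustrative):

CREATE TABLE gin_check_jsonb AS
  SELECT jsonb_build_object('k', i % 100, 'v', md5(i::text)) AS j
  FROM generate_series(1, 10000) AS i;
CREATE INDEX gin_check_jsonb_idx ON gin_check_jsonb USING gin (j jsonb_path_ops);
SELECT gin_index_check('gin_check_jsonb_idx');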

After extending the test a bit, I ran the tests and checked lcov:

verify_common.c 86.3%
verify_gin.c 38.4%
verify_heapam.c 57.2%
verify_nbtree.c 72.4%

This shows that verify_gin has the least coverage of all. The main areas lacking coverage have to do with posting list trees and concurrent page splits never being exercised. My first attempt to cover that with a TAP test using pgbench got the number up to 56.8%, but while trying to get that higher, I started getting error reports from verify_gin(), apparently out of function gin_check_parent_keys_consistency():

# at t/006_gin_concurrency.pl line 137.
# 'pgbench: error: client 14 script 1 aborted in command 0 query 0: ERROR: index "ginidx" has wrong tuple order on entry tree page, block 153, offset 8
# pgbench: error: client 0 script 1 aborted in command 0 query 0: ERROR: index "ginidx" has wrong tuple order on entry tree page, block 153, offset 8
# pgbench: error: client 12 script 1 aborted in command 0 query 0: ERROR: index "ginidx" has wrong tuple order on entry tree page, block 153, offset 8
# pgbench: error: client 7 script 1 aborted in command 0 query 0: ERROR: index "ginidx" has wrong tuple order on entry tree page, block 153, offset 8
# pgbench: error: client 1 script 1 aborted in command 0 query 0: ERROR: index "ginidx" has wrong tuple order on entry tree page, block 153, offset 8

<MORE LINES LIKE THE ABOVE SNIPPED>

The pgbench script is not corrupting anything overtly, so this looks to be either a bug in gin or a bug in the check. I am including a patchset with the original patches reworked plus the extra test cases. For completeness, I also added gin indexes to t/002_cic.pl and t/003_cic_2pc.pl.
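
For reference, the workload is conceptually two pgbench scripts along these
lines (a sketch under my assumptions; the actual table definition and script
weights in t/006_gin_concurrency.pl differ):

-- script 0: churn an int[] column covered by the gin index "ginidx"
\set k random(1, 1000)
INSERT INTO gin_check_t (a) VALUES (ARRAY[:k, :k + 1, :k + 2]);

-- script 1: run gin_index_check() against the same index while that happens
SELECT gin_index_check('ginidx');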

Attachments:

v36-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patchapplication/octet-stream; name=v36-0001-A-tiny-nitpicky-tweak-to-beautify-the-Amcheck-in.patch; x-unix-mode=0644Download
From 7c472ae3a64e4576e1779e0636c0016e89656829 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 26 Nov 2024 05:32:27 +0000
Subject: [PATCH v36 1/8] A tiny nitpicky tweak to beautify the Amcheck
 interiors.

The heaptuplespresent field in BtreeCheckState was not previously
adequately documented. To clarify the meaning of this field, the comment was changed.
---
 contrib/amcheck/verify_nbtree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index aac8c74f546..7543be17552 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -124,7 +124,7 @@ typedef struct BtreeCheckState
 
 	/* Bloom filter fingerprints B-Tree index */
 	bloom_filter *filter;
-	/* Debug counter */
+	/* Debug counter for reporting percentage of work already done */
 	int64		heaptuplespresent;
 } BtreeCheckState;
 
-- 
2.39.3 (Apple Git-145)

v36-0002-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v36-0002-Refactor-amcheck-internals-to-isolate-common-loc.patch; x-unix-mode=0644Download
From 2e9b9d7d105b4f8279a6c953d832616b8b256c8d Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:18:04 -0800
Subject: [PATCH v36 2/8] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, verification of other index AMs must take the same safety measures:
 - making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This contribution relocates the existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 61d7eaf2305..67a4ac8518d 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..c8ed685ba42
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given an index relation OID, lock the relation.
+* Next, take a number of standard actions:
+* 1) make sure the index can be checked,
+* 2) change the context of the user,
+* 3) keep track of GUCs modified via index functions,
+* 4) execute the callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that the relation is suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 7543be17552..9dc76f0e5d4 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0003-Add-gin_index_check-to-verify-GIN-index.patch (application/octet-stream)
From dfab1f7b014e3338143872ef839c18020c6a8525 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:27:00 -0800
Subject: [PATCH v36 3/8] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |  14 +
 contrib/amcheck/amcheck.control        |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   3 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 774 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  20 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..1b7a63cbaa4 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gin check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..445c48ccb7d
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 67a4ac8518d..b33e8c9b062 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gin.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..2dc5fbba619
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,774 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of a depth-first scan of a GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first scan of a GIN posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[MAXPGPATH];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			Assert(GinPageIsData(page));
+			maxoff = GinPageGetOpaque(page)->maxoff;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so complain if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for the GIN check. Allocates a memory context and scans
+ * through the GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+		rightlink = GinPageGetOpaque(page)->rightlink;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we didn't missed the downlink of the right sibling
+		 * parent, so that we missed the downlink of the right sibling
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 */
+			if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+				!(i == maxoff && rightlink == InvalidBlockNumber))
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not a result of a
+					 * concurrent page split. So, lock the parent and try to
+					 * re-find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If the downlink was re-found, re-check the key before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted. If so, proceed
+						 * to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af065615bc..08290c5a448 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,26 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require tuple adjustment) and that the page graph respects the
+      balanced-tree invariants (internal pages reference only leaf pages
+      or only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
   <tip>
    <para>
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0004-Fix-wording-in-GIN-README.patch (application/octet-stream)
From 16c96fa6df59f8419ea935e77448fa3db3ef8efc Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 3 Dec 2024 15:02:47 +0000
Subject: [PATCH v36 4/8] Fix wording in GIN README.

---
 src/backend/access/gin/README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index b0807316212..742bcbad499 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -237,10 +237,10 @@ GIN packs keys and downlinks into tuples in a different way.
 
 P_i is grouped with K_{i+1}.  -Inf key is not needed.
 
-There are couple of additional notes regarding K_{n+1} key.
-1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
+There are a couple of additional notes regarding K_{n+1} key.
+1) In the entry tree on the rightmost page, a key coupled with P_n doesn't really matter.
 Highkey is assumed to be infinity.
-2) In posting tree, a key coupled with P_n always doesn't matter.  Highkey for
+2) In the posting tree, a key coupled with P_n always doesn't matter.  Highkey for
 non-rightmost pages is stored separately and accessed via
 GinDataPageGetRightBound().
 
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0005-Fix-for-gin_index_check.patch (application/octet-stream)
From 330ad2f0fa34e6f2dffe19afa340cd3da2261920 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Mon, 16 Dec 2024 10:49:43 +0000
Subject: [PATCH v36 5/8] Fix for gin_index_check.

Never explicitly check the high key on the rightmost page of the entry tree.
Its value is not stored explicitly and is assumed to be infinity.
---
 contrib/amcheck/verify_gin.c | 43 ++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 2dc5fbba619..74783af54d4 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -163,6 +163,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 		Page		page;
 		OffsetNumber i,
 					maxoff;
+		BlockNumber rightlink;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -170,6 +171,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
+
 		Assert(GinPageIsData(page));
 
 		/* Check that the tree has the same height in all branches */
@@ -234,8 +236,8 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 * Check that tuples in each page are properly ordered and
 			 * consistent with parent high key
 			 */
-			Assert(GinPageIsData(page));
 			maxoff = GinPageGetOpaque(page)->maxoff;
+			rightlink = GinPageGetOpaque(page)->rightlink;
 
 			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
 
@@ -273,7 +275,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 */
 			bound = *GinDataPageGetRightBound(page);
 
-			if (stack->parentblk != InvalidBlockNumber &&
+			/*
+			 * The GIN page right bound has a sane value only if it is not
+			 * the high key of the rightmost page on its level.
+			 */
+			if (ItemPointerIsValid(&stack->parentkey) &&
+				rightlink != InvalidBlockNumber &&
 				!ItemPointerEquals(&stack->parentkey, &bound))
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -287,11 +294,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 
 			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 			{
+				GinPostingTreeScanItem *ptr;
 				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
 
 				/* ItemPointerGetOffsetNumber expects a valid pointer */
 				if (!(i == maxoff &&
-					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					  rightlink == InvalidBlockNumber))
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 ItemPointerGetBlockNumber(&posting_item->key),
 						 ItemPointerGetOffsetNumber(&posting_item->key),
@@ -300,8 +308,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
 
-				if (i == maxoff &&
-					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 				{
 					/*
 					 * The rightmost item in the tree level has (0, 0) as the
@@ -340,19 +347,23 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RelationGetRelationName(rel),
 									stack->blkno, i)));
 
-				/* If this is an internal page, recurse into the child */
-				if (!GinPageIsLeaf(page))
-				{
-					GinPostingTreeScanItem *ptr;
+				/* This is an internal page, recurse into the child */
+				ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+				ptr->depth = stack->depth + 1;
 
-					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
-					ptr->depth = stack->depth + 1;
+				/*
+				 * Set the rightmost parent key to an invalid item pointer.
+				 * Its value is 'Infinity' and is not explicitly stored.
+				 */
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
 					ptr->parentkey = posting_item->key;
-					ptr->parentblk = stack->blkno;
-					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
-					ptr->next = stack->next;
-					stack->next = ptr;
-				}
+
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+				ptr->next = stack->next;
+				stack->next = ptr;
 			}
 		}
 		LockBuffer(buffer, GIN_UNLOCK);
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0006-Add-gin-index-checking-test-for-jsonb-data.patch (application/octet-stream)
From 010d4cf5502695d8f745423ad5769b34192389ff Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:40:37 -0800
Subject: [PATCH v36 6/8] Add gin index checking test for jsonb data

Extend the previously committed test of gin index checking to also
include a table using jsonb_path_ops.
---
 contrib/amcheck/expected/check_gin.out | 14 ++++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 12 ++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index bbcde80e627..93147de0ef1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -62,3 +62,17 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+SELECT gin_index_check('gin_check_jsonb_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index bbd9b9f8281..92ddbbc7a89 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -38,3 +38,15 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+
+SELECT gin_index_check('gin_check_jsonb_idx');
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0007-Add-gin-to-the-create-index-concurrently-tap-tes.patch (application/octet-stream)
From 1e9dbdf291f3b81d7ec702fe262b41933498257b Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 12:10:26 -0800
Subject: [PATCH v36 7/8] Add gin to the create index concurrently tap tests

These tests are already checking btree, and can cheaply be extended
to also check gin, so do that.
---
 contrib/amcheck/t/002_cic.pl     | 10 +++++---
 contrib/amcheck/t/003_cic_2pc.pl | 40 ++++++++++++++++++++++++++------
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/contrib/amcheck/t/002_cic.pl b/contrib/amcheck/t/002_cic.pl
index 0b6a5a9e464..6a0c4f61125 100644
--- a/contrib/amcheck/t/002_cic.pl
+++ b/contrib/amcheck/t/002_cic.pl
@@ -21,8 +21,9 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 $node->safe_psql('postgres', q(CREATE INDEX idx ON tbl(i)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(j)));
 
 #
 # Stress CIC with pgbench.
@@ -40,13 +41,13 @@ $node->pgbench(
 	{
 		'002_pgbench_concurrent_transaction' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_transaction_savepoints' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_cic' => q(
@@ -54,7 +55,10 @@ $node->pgbench(
 			\if :gotlock
 				DROP INDEX CONCURRENTLY idx;
 				CREATE INDEX CONCURRENTLY idx ON tbl(i);
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 				SELECT bt_index_check('idx',true);
+				SELECT gin_index_check('ginidx');
 				SELECT pg_advisory_unlock(42);
 			\endif
 		  )
diff --git a/contrib/amcheck/t/003_cic_2pc.pl b/contrib/amcheck/t/003_cic_2pc.pl
index 9134487f3b4..00a446a381f 100644
--- a/contrib/amcheck/t/003_cic_2pc.pl
+++ b/contrib/amcheck/t/003_cic_2pc.pl
@@ -25,7 +25,7 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 
 
 #
@@ -41,7 +41,7 @@ my $main_h = $node->background_psql('postgres');
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 my $cic_h = $node->background_psql('postgres');
@@ -50,6 +50,7 @@ $cic_h->query_until(
 	qr/start/, q(
 \echo start
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $main_h->query_safe(
@@ -60,7 +61,7 @@ PREPARE TRANSACTION 'a';
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'a';));
@@ -69,7 +70,7 @@ $main_h->query_safe(
 	q(
 PREPARE TRANSACTION 'b';
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '"mary had a little lamb"');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'b';));
@@ -86,6 +87,9 @@ $cic_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after overlapping 2PC');
 
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after overlapping 2PC');
+
 
 #
 # Server restart shall not change whether prepared xact blocks CIC
@@ -94,7 +98,7 @@ is($result, '0', 'bt_index_check after overlapping 2PC');
 $node->safe_psql(
 	'postgres', q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 PREPARE TRANSACTION 'spans_restart';
 BEGIN;
 CREATE TABLE unused ();
@@ -108,12 +112,16 @@ $reindex_h->query_until(
 \echo start
 DROP INDEX CONCURRENTLY idx;
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+DROP INDEX CONCURRENTLY ginidx;
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $node->safe_psql('postgres', "COMMIT PREPARED 'spans_restart'");
 $reindex_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after 2PC and restart');
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after 2PC and restart');
 
 
 #
@@ -136,14 +144,14 @@ $node->pgbench(
 	{
 		'003_pgbench_concurrent_2pc' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'null');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
 		'003_pgbench_concurrent_2pc_savepoint' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'[false, "jnvaba", -76, 7, {"_": [1]}, 9]');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
@@ -163,7 +171,25 @@ $node->pgbench(
 				SELECT bt_index_check('idx',true);
 				SELECT pg_advisory_unlock(42);
 			\endif
+		  ),
+		'005_pgbench_concurrent_cic' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
+		  ),
+		'006_pgbench_concurrent_ric' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				REINDEX INDEX CONCURRENTLY ginidx;
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
 		  )
+
 	});
 
 $node->stop;
-- 
2.39.3 (Apple Git-145)

Attachment: v36-0008-Stress-test-verify_gin-using-pgbench.patch (application/octet-stream)
From 40858417c7fe2cfef0609e2e30e6f910fe25276f Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 12:11:07 -0800
Subject: [PATCH v36 8/8] Stress test verify_gin() using pgbench

Add a tap test which inserts, updates, deletes, and checks in
parallel.  Like all pgbench based tap tests, this test contains race
conditions between the operations, so Your Mileage May Vary.  For
me, on my laptop, I got failures like:

	index "ginidx" has wrong tuple order on entry tree page

which I have not yet investigated.  The test is included here for
anybody interested in debugging this failure.
---
 contrib/amcheck/t/006_gin_concurrency.pl | 196 +++++++++++++++++++++++
 1 file changed, 196 insertions(+)
 create mode 100644 contrib/amcheck/t/006_gin_concurrency.pl

diff --git a/contrib/amcheck/t/006_gin_concurrency.pl b/contrib/amcheck/t/006_gin_concurrency.pl
new file mode 100644
index 00000000000..afc67940d4d
--- /dev/null
+++ b/contrib/amcheck/t/006_gin_concurrency.pl
@@ -0,0 +1,196 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init;
+$node->append_conf('postgresql.conf',
+	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i integer[], j jsonb, k jsonb)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(i, j, k)));
+$node->safe_psql('postgres', q(CREATE TABLE jsondata (i serial, j jsonb)));
+$node->safe_psql('postgres', q(INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]')
+));
+
+#
+# Stress GIN with pgbench.
+#
+# Modify the table data, and hence the index data, from multiple processes
+# while other processes run the index checking code.  This should,
+# if the index is large enough, result in the checks running across
+# concurrent page splits.
+#
+$node->pgbench(
+	'--no-vacuum --client=20 --transactions=5000',
+	0,
+	[qr{actually processed}],
+	[qr{^$}],
+	'concurrent DML and index checking',
+	{
+		'006_gin_concurrency_insert_1' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+					FROM jsondata x, jsondata y
+					WHERE x.i = random(1,100)
+					  AND y.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_2' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT gs.i, j.j, j.j || j.j
+					FROM jsondata j,
+						 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+					WHERE j.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_nulls' => q(
+			INSERT INTO tbl (i, j, k) VALUES
+				(null,               null, null),
+				(null,               null, '[]'),
+				(null,               '[]', null),
+				(ARRAY[]::INTEGER[], null, null),
+				(null,               '[]', '[]'),
+				(ARRAY[]::INTEGER[], '[]', null),
+				(ARRAY[]::INTEGER[], '[]', '[]')
+		  ),
+		'006_gin_concurrency_update_i' => q(
+			UPDATE tbl
+				SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+				WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_j' => q(
+			UPDATE tbl
+				SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+				WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_k' => q(
+			UPDATE tbl
+				SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+				WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_delete' => q(
+			DELETE FROM tbl
+				WHERE random(1,5) = 3;
+		),
+		'006_gin_concurrency_gin_index_check' => q(
+				SELECT gin_index_check('ginidx');
+		)
+	});
+
+$node->stop;
+done_testing();
+
-- 
2.39.3 (Apple Git-145)

#65Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Mark Dilger (#64)
Re: Amcheck verification of GiST and GIN

On Feb 21, 2025, at 12:16 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

The pgbench script is not corrupting anything overtly, so this looks to be either a bug in GIN or a bug in the check.

I suspected the AccessShareLock taken by verify_gin() might be too weak, and upgraded it to ShareRowExclusiveLock so as to prevent concurrent table modifications (and, incidentally, other concurrent verify_gin() calls), but to my surprise that didn't fix anything. Even AccessExclusiveLock doesn't fix it. So this seems to be either a bug in the checking code complaining about perfectly valid tuple order, or a bug in GIN corrupting its own entry tree page.
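
For reference, a rough SQL-level analogue of that experiment is to take the stronger lock on the table explicitly before running the check, rather than changing the lock inside verify_gin(); the table and index names below are the ones from the pgbench test, and the lock level is only an illustration:

BEGIN;
-- ShareRowExclusiveLock conflicts with the RowExclusiveLock taken by
-- INSERT, UPDATE and DELETE, so concurrent DML on tbl (and hence on
-- ginidx) is blocked for the duration of the check.
LOCK TABLE tbl IN SHARE ROW EXCLUSIVE MODE;
SELECT gin_index_check('ginidx');
COMMIT;

If the check still complains under that lock, concurrent DML can be ruled out as the cause.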

On successive runs (instrumented to print out a bit more info), there doesn't seem to be any obvious pattern in where the corruption occurs. The offset in the page changes, being neither always at the beginning nor always at the maxoff; likewise, the block where corruption is detected changes from run to run. I've noticed that the rightlink for the page is always the page's block number plus one, but that might just be because I haven't run enough iterations yet to see counter-examples.
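
A quick way to double-check that rightlink observation, independently of the instrumentation in the checking code, is to dump the opaque data of the reported block with the pageinspect extension; block 5 below is just a placeholder for whichever block the check named:

CREATE EXTENSION IF NOT EXISTS pageinspect;
-- Opaque data (rightlink, maxoff, flags) of the suspect block and of the
-- block its rightlink would be expected to point to.
SELECT * FROM gin_page_opaque_info(get_raw_page('ginidx', 5));
SELECT * FROM gin_page_opaque_info(get_raw_page('ginidx', 6));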

Could one of the patch authors take a look? I don't have the time to chase this to conclusion just now. Thanks.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#66Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Mark Dilger (#65)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

On Feb 21, 2025, at 12:50 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

Could one of the patch authors take a look?

I turned the TAP test that triggers the error into a regression test that does likewise, to make it easier to step through, should anybody want to do that. I'm attaching that patch here, but please note that I'm not expecting this to be committed.

Attachments:

v0-0001-Add-a-reproducible-test-case-for-verify_gin-error.patch.no_applyapplication/octet-stream; name=v0-0001-Add-a-reproducible-test-case-for-verify_gin-error.patch.no_apply; x-unix-mode=0644Download
From d2ec9b8f77bf40ed258f5d12bc58877b6f1109ef Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 14:45:12 -0800
Subject: [PATCH v0] Add a reproducible test case for verify_gin errors

The test t/006_gin_concurrency.pl works fairly well at triggering
the bug, at least for me, but it is hard to step through a TAP test.
Convert the statements from the log output of a failed TAP run into
a regression test, find a setseed() value that makes it fail
deterministically, and add that test.

If you use this test and it does not fail, perhaps that is due to
architectural differences between your system and my laptop.  I was
able to find a seed by guess-and-check within no more than 10
attempts, so the reader can probably do likewise.
---
 contrib/amcheck/Makefile                |     2 +-
 contrib/amcheck/expected/stress_gin.out | 12925 ++++++++++++++++++++++
 contrib/amcheck/sql/stress_gin.sql      | 11432 +++++++++++++++++++
 3 files changed, 24358 insertions(+), 1 deletion(-)
 create mode 100644 contrib/amcheck/expected/stress_gin.out
 create mode 100644 contrib/amcheck/sql/stress_gin.sql

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 1b7a63cbaa4..66686f522ff 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -13,7 +13,7 @@ DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck
 		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_gin check_heap
+REGRESS = check check_btree check_gin check_heap stress_gin
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/expected/stress_gin.out b/contrib/amcheck/expected/stress_gin.out
new file mode 100644
index 00000000000..e2ace3cbf13
--- /dev/null
+++ b/contrib/amcheck/expected/stress_gin.out
@@ -0,0 +1,12925 @@
+SELECT setseed(0.2);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE tbl(i integer[], j jsonb, k jsonb);
+CREATE INDEX ginidx ON tbl USING gin(i, j, k);
+CREATE TABLE jsondata (i serial, j jsonb);
+INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+SELECT gin_index_check('ginidx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/stress_gin.sql b/contrib/amcheck/sql/stress_gin.sql
new file mode 100644
index 00000000000..881a7d442a5
--- /dev/null
+++ b/contrib/amcheck/sql/stress_gin.sql
@@ -0,0 +1,11432 @@
+SELECT setseed(0.2);
+CREATE TABLE tbl(i integer[], j jsonb, k jsonb);
+CREATE INDEX ginidx ON tbl USING gin(i, j, k);
+CREATE TABLE jsondata (i serial, j jsonb);
+INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
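+-- The VALUES batches add corner-case rows: NULLs, empty integer arrays, and
+-- empty jsonb documents.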
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
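+-- Bulk insert: integer arrays aggregated from generate_series() paired with
+-- jsonb values drawn from the jsondata table.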
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+UPDATE tbl
+	SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+	WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+DELETE FROM tbl
+	WHERE random(1,5) = 3;
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
+UPDATE tbl
+	SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+	WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
+SELECT gin_index_check('ginidx');
-- 
2.39.3 (Apple Git-145)

#67Kirill Reshke
reshkekirill@gmail.com
In reply to: Mark Dilger (#66)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sat, 22 Feb 2025 at 03:51, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

On Feb 21, 2025, at 12:50 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

Could one of the patch authors take a look?

I turned the TAP test which triggers the error into a regression test that does likewise, for ease of stepping through the test, if anybody should want to do that. I'm attaching that patch here, but please note that I'm not expecting this to be committed.

Hi!
Your efforts are much appreciated!
I used this patch to derive a smaller repro.

this seems to either be a bug in the checking code complaining about perfectly valid tuple order,

I'm doubtful this is the case. I added some more logging to
gin_index_check, and here is the output after running the attached script:
```
DEBUG: processing entry tree page at blk 2, maxoff: 125
....
DEBUG: comparing for offset 78 category 0
DEBUG: comparing for offset 79 category 2
DEBUG: comparing for offset 80 category 3
DEBUG: comparing for offset 81 category 0
LOG: index "ginidx" has wrong tuple order on entry tree page, block
2, offset 81, rightlink 4294967295
DEBUG: comparing for offset 82 category 0
....
DEBUG: comparing for offset 100 category 0
DEBUG: comparing for offset 101 category 2
DEBUG: comparing for offset 102 category 3
DEBUG: comparing for offset 103 category 0
LOG: index "ginidx" has wrong tuple order on entry tree page, block
2, offset 103, rightlink 4294967295
DEBUG: comparing for offset 104 category 0
DEBUG: comparing for offset 105 category 0
```
So we have an entry tree page where there is a tuple at offset 80 with
GIN tuple category = 3, and then it goes back to category 0. There is
one more such pattern on the same page.
The ginCompareEntries function compares the GIN tuple categories first.
I do not understand how this could be a valid order on the page, given
that ginCompareEntries is used in the `ginget.c` logic. Maybe I'm
missing something vital about GIN.
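
For reference, here is a simplified standalone sketch of the category-first ordering I am assuming (loosely modeled on ginCompareEntries in ginutil.c; the category values and the fallthrough to the opclass compare function reflect my reading of the source, not a verbatim copy):

```
/*
 * Sketch of GIN entry ordering within a single index attribute, assuming
 * category values 0 = normal key, 1 = empty item, 2 = null key,
 * 3 = null item.  Entries of different categories sort by category alone;
 * only two normal keys reach the datatype comparison.
 */
static int
gin_entry_cmp_sketch(int category_a, const void *key_a,
                     int category_b, const void *key_b,
                     int (*key_cmp) (const void *, const void *))
{
    if (category_a != category_b)
        return (category_a < category_b) ? -1 : 1;

    /* all placeholder entries of the same category compare equal */
    if (category_a != 0)
        return 0;

    /* two normal keys: fall back to the opclass comparison */
    return key_cmp(key_a, key_b);
}
```

Under that rule, within a single attribute, a category 3 entry followed by a category 0 entry would indeed be out of order.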

--
Best regards,
Kirill Reshke

Attachments:

0001-Much-smaller-repro.patchapplication/octet-stream; name=0001-Much-smaller-repro.patchDownload
From 657a26e191b97ce3d4868a97178a61969dfe2dff Mon Sep 17 00:00:00 2001
From: reshke <reshkekirill@gmail.com>
Date: Fri, 28 Feb 2025 06:11:58 +0000
Subject: [PATCH] Much smaller repro

---
 contrib/amcheck/stress_gin_3.sql | 145 +++++++++++++++++++++++++++++++
 1 file changed, 145 insertions(+)
 create mode 100644 contrib/amcheck/stress_gin_3.sql

diff --git a/contrib/amcheck/stress_gin_3.sql b/contrib/amcheck/stress_gin_3.sql
new file mode 100644
index 00000000000..0d73947c87b
--- /dev/null
+++ b/contrib/amcheck/stress_gin_3.sql
@@ -0,0 +1,145 @@
+CREATE EXTENSION amcheck;
+SELECT setseed(0.2);
+CREATE TABLE tbl(i integer[], j jsonb, k jsonb);
+CREATE INDEX ginidx ON tbl USING gin(i, j, k) WITH (fastupdate=false, gin_pending_list_limit=64);
+CREATE TABLE jsondata (i serial, j jsonb);
+INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]');
+INSERT INTO tbl (i, j, k) VALUES
+	(null,               null, null),
+	(null,               null, '[]'),
+	(null,               '[]', null),
+	(ARRAY[]::INTEGER[], null, null),
+	(null,               '[]', '[]'),
+	(ARRAY[]::INTEGER[], '[]', null),
+	(ARRAY[]::INTEGER[], '[]', '[]');
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+INSERT INTO tbl (i, j, k)
+	(SELECT gs.i, j.j, j.j || j.j
+	FROM jsondata j,
+	 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+	WHERE j.i = random(1,100)
+	);
+UPDATE tbl
+	SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+	WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+INSERT INTO tbl (i, j, k)
+	(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+	FROM jsondata x, jsondata y
+	WHERE x.i = random(1,100)
+	  AND y.i = random(1,100)
+	);
+SELECT gin_index_check('ginidx');
-- 
2.43.0

#68Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Kirill Reshke (#67)
Re: Amcheck verification of GiST and GIN

So, we have an entry tree page where there is a tuple at offset 80 with
GIN tuple category = 3, and then it drops back to category 0. And there
is one more such pattern on the same page.
The ginCompareEntries function compares the GIN tuple category first.
I do not understand how this could be a valid order on the page, given
how ginCompareEntries is used in the `ginget.c` logic. Maybe I'm
missing something vital about GIN.

The only obvious definition of "wrong" for this is that GIN index scans
return different result sets than table scans over the same data. Using
your much smaller reproducible test case, I added queries like:

SELECT COUNT(*) FROM tbl WHERE j @> '"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '{"": "r", "hji4124": "", "HTJP_DAptxn6": 9}'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '[]'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> NULL::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '{"": -6, "__": [""], "YMb": -22}'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '{"853": -60, "pjx": "", "TGLUG_jxmrggv": null}'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'::jsonb;
SELECT COUNT(*) FROM tbl WHERE j @> '{"F18s": {"": -84194}, "ececab2": [""]}'::jsonb;

The results are the same with or without the index. Can you find any
examples where they differ?
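
One mechanical way to run that comparison on the repro table, in case it
is useful: a sketch that leans on the stock enable_seqscan and
enable_bitmapscan GUCs and on the fact that GIN indexes are only used
via bitmap scans, so disabling bitmap scans forces a plain table scan:

```
BEGIN;
-- Plan that should use ginidx (bitmap index scan).
SET LOCAL enable_seqscan = off;
SELECT count(*) AS with_index
  FROM tbl WHERE j @> '{"F18s": {"": -84194}, "ececab2": [""]}'::jsonb;
-- Plan that cannot use ginidx.
SET LOCAL enable_seqscan = on;
SET LOCAL enable_bitmapscan = off;
SELECT count(*) AS without_index
  FROM tbl WHERE j @> '{"F18s": {"": -84194}, "ececab2": [""]}'::jsonb;
ROLLBACK;
```

EXPLAIN on the same statements confirms which plan was actually chosen.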


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#69Kirill Reshke
reshkekirill@gmail.com
In reply to: Mark Dilger (#68)
7 attachment(s)
Re: Amcheck verification of GiST and GIN

On Fri, 28 Feb 2025 at 23:31, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

The only obvious definition of "wrong" for this is that GIN index scans return different result sets than table scans over the same data. Using your much smaller reproducible test case, I added queries like:

Yeah, you are 100% right. Actually, over the course of this thread we
have not spotted any GIN bugs yet, only GIN amcheck bugs.

This also turns out to be a GIN amcheck bug:

```
DEBUG: comparing for offset 79 category 2 key attnum 1
DEBUG: comparing for offset 80 category 3 key attnum 1
DEBUG: comparing for offset 81 category 0 key attnum 2
LOG: index "ginidx" has wrong tuple order on entry tree page, block
2, offset 81, rightlink 4294967295
DEBUG: comparing for offset 82 category 0 key attnum 2
....
DEBUG: comparing for offset 100 category 0 key attnum 2
DEBUG: comparing for offset 101 category 2 key attnum 2
DEBUG: comparing for offset 102 category 3 key attnum 2
DEBUG: comparing for offset 103 category 0 key attnum 3
LOG: index "ginidx" has wrong tuple order on entry tree page, block
2, offset 103, rightlink 4294967295
DEBUG: comparing for offset 104 category 0 key attnum 3
DEBUG: comparing for offset 105 category 0 key attnum 3
```
It turns out we compare page entries belonging to different attributes in
gin_check_parent_keys_consistency.

Trivial fix attached (see v37-0004). I now simply compare the current and
previous attribute numbers. This resolves the issue uncovered by
`v0-0001-Add-a-reproducible-test-case-for-verify_gin-error.patch.no_apply`.
However, the stress test still does not seem to pass. On my PC it never
ends; all processes are stuck in DELETE waiting/UPDATE waiting state.
I will take another look tomorrow.
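
When I do, step one will probably be to look at who blocks whom; a quick
sketch of the diagnostic I have in mind, using only stock
pg_stat_activity columns and pg_blocking_pids():

```
-- Show each backend waiting on a heavyweight lock and who blocks it.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       state,
       left(query, 80) AS query
  FROM pg_stat_activity
 WHERE wait_event_type = 'Lock';
```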

P.S. Just as I was about to send this message, I discovered that
v34-0003-Add-gist_index_check-function-to-verify-GiST-ind.patch and
v34-0005-Add-GiST-support-to-pg_amcheck.patch are now missing from this
patch series ;(

--
Best regards,
Kirill Reshke

Attachments:

v37-0004-Fix-for-gin_index_check.patchapplication/octet-stream; name=v37-0004-Fix-for-gin_index_check.patchDownload
From 0e86a027ab785cb134d8cb5b8eee200b7c046569 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Mon, 16 Dec 2024 10:49:43 +0000
Subject: [PATCH v37 4/7] Fix for gin_index_check.

Never explicitly check the high key on the rightmost page of the entry tree.
Its value is not stored explicitly and is equal to infinity.
Also, never compare GIN entries for different attnums.
---
 contrib/amcheck/verify_gin.c | 56 ++++++++++++++++++++++--------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 2dc5fbba619..ae3d24a8b86 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -163,6 +163,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 		Page		page;
 		OffsetNumber i,
 					maxoff;
+		BlockNumber rightlink;
 
 		CHECK_FOR_INTERRUPTS();
 
@@ -170,6 +171,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
+
 		Assert(GinPageIsData(page));
 
 		/* Check that the tree has the same height in all branches */
@@ -182,7 +184,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 
 			ItemPointerSetMin(&minItem);
 
-			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+			elog(DEBUG2, "page blk: %u, type leaf", stack->blkno);
 
 			if (leafdepth == -1)
 				leafdepth = stack->depth;
@@ -234,8 +236,8 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 * Check that tuples in each page are properly ordered and
 			 * consistent with parent high key
 			 */
-			Assert(GinPageIsData(page));
 			maxoff = GinPageGetOpaque(page)->maxoff;
+			rightlink = GinPageGetOpaque(page)->rightlink;
 
 			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
 
@@ -273,7 +275,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 			 */
 			bound = *GinDataPageGetRightBound(page);
 
-			if (stack->parentblk != InvalidBlockNumber &&
+			/*
+			 * The GIN page right bound only has a sane value when this is not
+			 * the high key of the rightmost page on its level.
+			 */
+			if (ItemPointerIsValid(&stack->parentkey) &&
+				rightlink != InvalidBlockNumber &&
 				!ItemPointerEquals(&stack->parentkey, &bound))
 				ereport(ERROR,
 						(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -287,11 +294,12 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 
 			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 			{
+				GinPostingTreeScanItem *ptr;
 				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
 
 				/* ItemPointerGetOffsetNumber expects a valid pointer */
 				if (!(i == maxoff &&
-					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					  rightlink == InvalidBlockNumber))
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 ItemPointerGetBlockNumber(&posting_item->key),
 						 ItemPointerGetOffsetNumber(&posting_item->key),
@@ -300,8 +308,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 					elog(DEBUG3, "key (%u, %u) -> %u",
 						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
 
-				if (i == maxoff &&
-					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 				{
 					/*
 					 * The rightmost item in the tree level has (0, 0) as the
@@ -340,19 +347,23 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 									RelationGetRelationName(rel),
 									stack->blkno, i)));
 
-				/* If this is an internal page, recurse into the child */
-				if (!GinPageIsLeaf(page))
-				{
-					GinPostingTreeScanItem *ptr;
+				/* This is an internal page, recurse into the child */
+				ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+				ptr->depth = stack->depth + 1;
 
-					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
-					ptr->depth = stack->depth + 1;
+				/*
+				 * Set the rightmost parent key to an invalid item pointer. Its
+				 * value is 'infinity' and is not stored explicitly.
+				 */
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
 					ptr->parentkey = posting_item->key;
-					ptr->parentblk = stack->blkno;
-					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
-					ptr->next = stack->next;
-					stack->next = ptr;
-				}
+
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+				ptr->next = stack->next;
+				stack->next = ptr;
 			}
 		}
 		LockBuffer(buffer, GIN_UNLOCK);
@@ -411,7 +422,8 @@ gin_check_parent_keys_consistency(Relation rel,
 		Buffer		buffer;
 		Page		page;
 		OffsetNumber i,
-					maxoff;
+					maxoff,
+					prev_attnum;
 		XLogRecPtr	lsn;
 		IndexTuple	prev_tuple;
 		BlockNumber rightlink;
@@ -488,6 +500,7 @@ gin_check_parent_keys_consistency(Relation rel,
 		 * with parent high key
 		 */
 		prev_tuple = NULL;
+		prev_attnum = InvalidAttrNumber;
 		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
 		{
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
@@ -511,7 +524,7 @@ gin_check_parent_keys_consistency(Relation rel,
 			 * for high key on rightmost page, as this key is not really
 			 * stored explicitly.
 			 */
-			if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
 				!(i == maxoff && rightlink == InvalidBlockNumber))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
@@ -520,8 +533,8 @@ gin_check_parent_keys_consistency(Relation rel,
 									  current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
-							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
-									RelationGetRelationName(rel), stack->blkno, i)));
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
+									RelationGetRelationName(rel), stack->blkno, i, rightlink)));
 			}
 
 			/*
@@ -620,6 +633,7 @@ gin_check_parent_keys_consistency(Relation rel,
 			}
 
 			prev_tuple = CopyIndexTuple(idxtuple);
+			prev_attnum = attnum;
 		}
 
 		LockBuffer(buffer, GIN_UNLOCK);
-- 
2.43.0

v37-0003-Fix-wording-in-GIN-README.patchapplication/octet-stream; name=v37-0003-Fix-wording-in-GIN-README.patchDownload
From 4c13e172d5ddfbdac412f4247e3e049d075b7699 Mon Sep 17 00:00:00 2001
From: reshke kirill <reshke@double.cloud>
Date: Tue, 3 Dec 2024 15:02:47 +0000
Subject: [PATCH v37 3/7] Fix wording in GIN README.

---
 src/backend/access/gin/README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index b0807316212..742bcbad499 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -237,10 +237,10 @@ GIN packs keys and downlinks into tuples in a different way.
 
 P_i is grouped with K_{i+1}.  -Inf key is not needed.
 
-There are couple of additional notes regarding K_{n+1} key.
-1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
+There are a couple of additional notes regarding K_{n+1} key.
+1) In the entry tree on the rightmost page, a key coupled with P_n doesn't really matter.
 Highkey is assumed to be infinity.
-2) In posting tree, a key coupled with P_n always doesn't matter.  Highkey for
+2) In the posting tree, a key coupled with P_n always doesn't matter.  Highkey for
 non-rightmost pages is stored separately and accessed via
 GinDataPageGetRightBound().
 
-- 
2.43.0

v37-0002-Add-gin_index_check-to-verify-GIN-index.patchapplication/octet-stream; name=v37-0002-Add-gin_index_check-to-verify-GIN-index.patchDownload
From 88a76cc81a2479f901e382d80cc83b60f9009b7e Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:27:00 -0800
Subject: [PATCH v37 2/7] Add gin_index_check() to verify GIN index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |  14 +
 contrib/amcheck/amcheck.control        |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   3 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 774 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  20 +
 8 files changed, 920 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..1b7a63cbaa4 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gin check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..445c48ccb7d
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 67a4ac8518d..b33e8c9b062 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gin.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..2dc5fbba619
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,774 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ * Verification checks that all paths in the GIN graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of GIN index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+}			GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of depth-first scan of GIN  posting tree.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+}			GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Allocates a memory context and scans through the posting tree graph.
+ *
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[MAXPGPATH];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			Assert(GinPageIsData(page));
+			maxoff = GinPageGetOpaque(page)->maxoff;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u)",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  GinPageGetOpaque(page)->rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff &&
+					GinPageGetOpaque(page)->rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* If this is an internal page, recurse into the child */
+				if (!GinPageIsLeaf(page))
+				{
+					GinPostingTreeScanItem *ptr;
+
+					ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+					ptr->depth = stack->depth + 1;
+					ptr->parentkey = posting_item->key;
+					ptr->parentblk = stack->blkno;
+					ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN check. Allocates memory context and scans through
+ * GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+		rightlink = GinPageGetOpaque(page)->rightlink;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 */
+			if (i != FirstOffsetNumber && stack->blkno != GIN_ROOT_BLKNO &&
+				!(i == maxoff && rightlink == InvalidBlockNumber))
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * find the downlink for the current page. It may be
+					 * missing due to a concurrent page split; this is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* We found it - make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted. If so, proceed
+						 * to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses all three.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index a12aa3abf01..98f836e15e7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,26 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuples
+      require adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
   <tip>
    <para>
-- 
2.43.0

v37-0001-Refactor-amcheck-internals-to-isolate-common-loc.patchapplication/octet-stream; name=v37-0001-Refactor-amcheck-internals-to-isolate-common-loc.patchDownload
From b9cc198a4daf14f1ee5242ed7413be90bda18094 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:18:04 -0800
Subject: [PATCH v37 1/7] Refactor amcheck internals to isolate common locking
 and checking routines
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before doing checks, other indexes must take the same safety measures:
 - Making sure the index can be checked
 - changing the context of the user
 - keeping track of GUCs modified via index functions
This commit relocates the existing functionality to verify_common.c for reuse.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 6 files changed, 296 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 61d7eaf2305..67a4ac8518d 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..c8ed685ba42
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given index relation OID, lock relation.
+* Next, take a number of standard actions:
+* 1) Make sure the index can be checked
+* 2) change the context of the user,
+* 3) keep track of GUCs modified via index functions
+* 4) execute callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..30994e22933
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2017-2023, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 7543be17552..9dc76f0e5d4 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -156,14 +157,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * Check arguments
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+}			BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -238,15 +247,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -264,18 +279,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -284,193 +304,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
-- 
2.43.0
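
To make the shared entry point above concrete, here is a minimal sketch (not part of the patch) of how an AM-specific SQL-callable checker would hook into it. The function and callback names are illustrative assumptions; only amcheck_lock_relation_and_check(), the IndexDoCheckCallback typedef and GIST_AM_OID come from verify_common.h and the core catalog headers.

/*
 * Minimal sketch, not part of the patch: the function and callback names
 * are illustrative.  Only amcheck_lock_relation_and_check(),
 * IndexDoCheckCallback and GIST_AM_OID come from verify_common.h and the
 * core headers.
 */
#include "postgres.h"

#include "catalog/pg_am.h"
#include "fmgr.h"
#include "verify_common.h"

/* AM-specific verification would live here; left empty in this sketch */
static void
example_gist_check_callback(Relation indrel, Relation heaprel,
							void *state, bool readonly)
{
}

PG_FUNCTION_INFO_V1(example_gist_check);

Datum
example_gist_check(PG_FUNCTION_ARGS)
{
	Oid			indrelid = PG_GETARG_OID(0);

	/*
	 * The shared routine locks heap and index in the right order, switches
	 * to the table owner's userid, tracks GUC changes made by index
	 * functions, verifies that the relation is a valid, non-temporary index
	 * of the expected AM, and only then invokes the callback.
	 */
	amcheck_lock_relation_and_check(indrelid, GIST_AM_OID,
									example_gist_check_callback,
									AccessShareLock, NULL);

	PG_RETURN_VOID();
}

An AM that needs extra options would pass a pointer to its own argument struct instead of NULL, the same way the B-Tree callback receives a BTCallbackState.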

v37-0005-Add-gin-index-checking-test-for-jsonb-data.patchapplication/octet-stream; name=v37-0005-Add-gin-index-checking-test-for-jsonb-data.patchDownload
From 3a5161bd1125a0a4e4e449501a1c02e6e8f7cc42 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 09:40:37 -0800
Subject: [PATCH v37 5/7] Add gin index checking test for jsonb data

Extend the previously committed test of gin index checking to also
include a table using jsonb_path_ops.
---
 contrib/amcheck/expected/check_gin.out | 14 ++++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 12 ++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index bbcde80e627..93147de0ef1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -62,3 +62,17 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+SELECT gin_index_check('gin_check_jsonb_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index bbd9b9f8281..92ddbbc7a89 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -38,3 +38,15 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+
+SELECT gin_index_check('gin_check_jsonb_idx');
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
-- 
2.43.0

v37-0006-Add-gin-to-the-create-index-concurrently-tap-tes.patchapplication/octet-stream; name=v37-0006-Add-gin-to-the-create-index-concurrently-tap-tes.patchDownload
From 84abfcbd7afcb77f19a24cd83438a10d7355cec0 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 12:10:26 -0800
Subject: [PATCH v37 6/7] Add gin to the create index concurrently tap tests

These tests are already checking btree, and can cheaply be extended
to also check gin, so do that.
---
 contrib/amcheck/t/002_cic.pl     | 10 +++++---
 contrib/amcheck/t/003_cic_2pc.pl | 40 ++++++++++++++++++++++++++------
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/contrib/amcheck/t/002_cic.pl b/contrib/amcheck/t/002_cic.pl
index 0b6a5a9e464..6a0c4f61125 100644
--- a/contrib/amcheck/t/002_cic.pl
+++ b/contrib/amcheck/t/002_cic.pl
@@ -21,8 +21,9 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 $node->safe_psql('postgres', q(CREATE INDEX idx ON tbl(i)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(j)));
 
 #
 # Stress CIC with pgbench.
@@ -40,13 +41,13 @@ $node->pgbench(
 	{
 		'002_pgbench_concurrent_transaction' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_transaction_savepoints' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_cic' => q(
@@ -54,7 +55,10 @@ $node->pgbench(
 			\if :gotlock
 				DROP INDEX CONCURRENTLY idx;
 				CREATE INDEX CONCURRENTLY idx ON tbl(i);
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 				SELECT bt_index_check('idx',true);
+				SELECT gin_index_check('ginidx');
 				SELECT pg_advisory_unlock(42);
 			\endif
 		  )
diff --git a/contrib/amcheck/t/003_cic_2pc.pl b/contrib/amcheck/t/003_cic_2pc.pl
index 9134487f3b4..00a446a381f 100644
--- a/contrib/amcheck/t/003_cic_2pc.pl
+++ b/contrib/amcheck/t/003_cic_2pc.pl
@@ -25,7 +25,7 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 
 
 #
@@ -41,7 +41,7 @@ my $main_h = $node->background_psql('postgres');
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 my $cic_h = $node->background_psql('postgres');
@@ -50,6 +50,7 @@ $cic_h->query_until(
 	qr/start/, q(
 \echo start
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $main_h->query_safe(
@@ -60,7 +61,7 @@ PREPARE TRANSACTION 'a';
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'a';));
@@ -69,7 +70,7 @@ $main_h->query_safe(
 	q(
 PREPARE TRANSACTION 'b';
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '"mary had a little lamb"');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'b';));
@@ -86,6 +87,9 @@ $cic_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after overlapping 2PC');
 
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after overlapping 2PC');
+
 
 #
 # Server restart shall not change whether prepared xact blocks CIC
@@ -94,7 +98,7 @@ is($result, '0', 'bt_index_check after overlapping 2PC');
 $node->safe_psql(
 	'postgres', q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 PREPARE TRANSACTION 'spans_restart';
 BEGIN;
 CREATE TABLE unused ();
@@ -108,12 +112,16 @@ $reindex_h->query_until(
 \echo start
 DROP INDEX CONCURRENTLY idx;
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+DROP INDEX CONCURRENTLY ginidx;
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $node->safe_psql('postgres', "COMMIT PREPARED 'spans_restart'");
 $reindex_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after 2PC and restart');
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after 2PC and restart');
 
 
 #
@@ -136,14 +144,14 @@ $node->pgbench(
 	{
 		'003_pgbench_concurrent_2pc' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'null');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
 		'003_pgbench_concurrent_2pc_savepoint' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'[false, "jnvaba", -76, 7, {"_": [1]}, 9]');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
@@ -163,7 +171,25 @@ $node->pgbench(
 				SELECT bt_index_check('idx',true);
 				SELECT pg_advisory_unlock(42);
 			\endif
+		  ),
+		'005_pgbench_concurrent_cic' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
+		  ),
+		'006_pgbench_concurrent_ric' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				REINDEX INDEX CONCURRENTLY ginidx;
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
 		  )
+
 	});
 
 $node->stop;
-- 
2.43.0

v37-0007-Stress-test-verify_gin-using-pgbench.patchapplication/octet-stream; name=v37-0007-Stress-test-verify_gin-using-pgbench.patchDownload
From 8682c4b200b32b38eb74b9cdd45d5995fb1f79ee Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 12:11:07 -0800
Subject: [PATCH v37 7/7] Stress test verify_gin() using pgbench

Add a tap test which inserts, updates, deletes, and checks in
parallel.  Like all pgbench based tap tests, this test contains race
conditions between the operations, so Your Mileage May Vary.  For
me, on my laptop, I got failures like:

	index "ginidx" has wrong tuple order on entry tree page

which I have not yet investigated.  The test is included here for
anybody interested in debugging this failure.
---
 contrib/amcheck/t/006_gin_concurrency.pl | 196 +++++++++++++++++++++++
 1 file changed, 196 insertions(+)
 create mode 100644 contrib/amcheck/t/006_gin_concurrency.pl

diff --git a/contrib/amcheck/t/006_gin_concurrency.pl b/contrib/amcheck/t/006_gin_concurrency.pl
new file mode 100644
index 00000000000..afc67940d4d
--- /dev/null
+++ b/contrib/amcheck/t/006_gin_concurrency.pl
@@ -0,0 +1,196 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init;
+$node->append_conf('postgresql.conf',
+	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i integer[], j jsonb, k jsonb)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(i, j, k)));
+$node->safe_psql('postgres', q(CREATE TABLE jsondata (i serial, j jsonb)));
+$node->safe_psql('postgres', q(INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]')
+));
+
+#
+# Stress gin with pgbench.
+#
+# Modify the table data, and hence the index data, from multiple process
+# while from other processes run the index checking code.  This should,
+# if the index is large enough, result in the checks performing across
+# concurrent page splits.
+#
+$node->pgbench(
+	'--no-vacuum --client=20 --transactions=5000',
+	0,
+	[qr{actually processed}],
+	[qr{^$}],
+	'concurrent DML and index checking',
+	{
+		'006_gin_concurrency_insert_1' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+					FROM jsondata x, jsondata y
+					WHERE x.i = random(1,100)
+					  AND y.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_2' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT gs.i, j.j, j.j || j.j
+					FROM jsondata j,
+						 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+					WHERE j.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_nulls' => q(
+			INSERT INTO tbl (i, j, k) VALUES
+				(null,               null, null),
+				(null,               null, '[]'),
+				(null,               '[]', null),
+				(ARRAY[]::INTEGER[], null, null),
+				(null,               '[]', '[]'),
+				(ARRAY[]::INTEGER[], '[]', null),
+				(ARRAY[]::INTEGER[], '[]', '[]')
+		  ),
+		'006_gin_concurrency_update_i' => q(
+			UPDATE tbl
+				SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+				WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_j' => q(
+			UPDATE tbl
+				SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+				WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_k' => q(
+			UPDATE tbl
+				SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+				WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_delete' => q(
+			DELETE FROM tbl
+				WHERE random(1,5) = 3;
+		),
+		'006_gin_concurrency_gin_index_check' => q(
+				SELECT gin_index_check('ginidx');
+		)
+	});
+
+$node->stop;
+done_testing();
+
-- 
2.43.0

#70Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Tomas Vondra (#60)
Re: Amcheck verification of GiST and GIN

On Fri, Feb 21, 2025 at 6:29 AM Tomas Vondra <tomas@vondra.me> wrote:

Hi,

I see this patch didn't move since December :-( I still think these
improvements would be useful, it certainly was very helpful when I was
working on the GIN/GiST parallel builds (the GiST builds stalled, but I
hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the GiST
bits, because I know very little about that AM (the parallel builds made
me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and the
recent changes were mostly cosmetic. Does anyone object to me polishing
and pushing those parts?

Kirill may have addressed my concerns in the latest version. I have not
had time for another review. Tomas, would you still like to review and
push this patch? I have no objection.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#71Tomas Vondra
tomas@vondra.me
In reply to: Mark Dilger (#70)
Re: Amcheck verification of GiST and GIN

On 3/27/25 16:30, Mark Dilger wrote:

On Fri, Feb 21, 2025 at 6:29 AM Tomas Vondra <tomas@vondra.me> wrote:

Hi,

I see this patch didn't move since December :-( I still think these
improvements would be useful, it certainly was very helpful when I was
working on the GIN/GiST parallel builds (the GiST builds stalled, but I
hope to push the GIN patches soon).

So I'd like to get some of this in too. I'm not sure about the GiST
bits, because I know very little about that AM (the parallel builds made
me acutely aware of that).

But I'd like to get the GIN parts in. We're at v34 already, and the
recent changes were mostly cosmetic. Does anyone object to me polishing
and pushing those parts?

Kirill may have addressed my concerns in the latest version.  I have not
had time for another review.  Tomas, would you still like to review and
push this patch?  I have no objection.

Thanks for reminding me. I think the patches are in good shape, but I'll
take a look once more, and I hope to get it committed.

regards

--
Tomas Vondra

#72Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#71)
6 attachment(s)
Re: Amcheck verification of GiST and GIN

Here's a polished version of the patches. If you have any
comments/objections, please speak now. I don't plan to push 0006 (the
stress test), of course.

Changes I did:

1) update / write proper commit messages, hopefully explaining the
purpose of each patch well enough

2) update the lists of reviewers/authors (would appreciate someone
checking - it's hard to keep track for a thread that runs for years, and
it may not be quite clear what qualifies as a review)

3) squash the fix patch into the right patch, moved the README fix to be
the first patch (doesn't really matter)

4) minor cleanups in the main patches (0002 and 0003), mostly adding the
structs to typedefs.list and tweaking a couple comments

5) I've adjusted the names of the memory contexts, because having both
named "amcheck context" seemed confusing, especially as they are used in
caller/callee functions. So now it's

- amcheck consistency check context
- posting tree check context

regards

--
Tomas Vondra

Attachments:

v20250328-0001-Fix-grammar-in-GIN-README.patchtext/x-patch; charset=UTF-8; name=v20250328-0001-Fix-grammar-in-GIN-README.patchDownload
From 28b392b687f641b09bc79bb3bb3e61505845e6c1 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Fri, 28 Mar 2025 16:49:04 +0100
Subject: [PATCH v20250328 1/6] Fix grammar in GIN README

Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CALdSSPgu9uAhVYojQ0yjG%3Dq5MaqmiSLUJPhz%2B-u7cA6K6Mc9UA%40mail.gmail.com
---
 src/backend/access/gin/README | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index b0807316212..742bcbad499 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -237,10 +237,10 @@ GIN packs keys and downlinks into tuples in a different way.
 
 P_i is grouped with K_{i+1}.  -Inf key is not needed.
 
-There are couple of additional notes regarding K_{n+1} key.
-1) In entry tree rightmost page, a key coupled with P_n doesn't really matter.
+There are a couple of additional notes regarding K_{n+1} key.
+1) In the entry tree on the rightmost page, a key coupled with P_n doesn't really matter.
 Highkey is assumed to be infinity.
-2) In posting tree, a key coupled with P_n always doesn't matter.  Highkey for
+2) In the posting tree, a key coupled with P_n always doesn't matter.  Highkey for
 non-rightmost pages is stored separately and accessed via
 GinDataPageGetRightBound().
 
-- 
2.49.0

v20250328-0002-amcheck-Move-common-routines-into-a-separa.patchtext/x-patch; charset=UTF-8; name=v20250328-0002-amcheck-Move-common-routines-into-a-separa.patchDownload
From 40288f3d42b2e8ca716333826e5b67cec5fa7b3d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Fri, 28 Mar 2025 15:40:09 +0100
Subject: [PATCH v20250328 2/6] amcheck: Move common routines into a separate
 module
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Before performing checks on an index, we need to take some safety
measures that apply to all index AMs. This includes:

* verifying that the index can be checked - Only selected AMs are
supported by amcheck (right now only B-Tree). The index has to be
valid and not a temporary index from another session.

* changing (and then restoring) user's security context

* obtaining proper locks on the index (and table, if needed)

* discarding GUC changes from the index functions

Until now this was implemented in the B-Tree amcheck module, but it's
something every AM will have to do. So relocate the code into a new
module verify_common for reuse.

The shared steps are implemented by amcheck_lock_relation_and_check(),
receiving the AM-specific verification as a callback. Custom parameters
may be supplied using a pointer.

Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile                 |   1 +
 contrib/amcheck/expected/check_btree.out |   4 +-
 contrib/amcheck/meson.build              |   1 +
 contrib/amcheck/verify_common.c          | 191 ++++++++++++++++
 contrib/amcheck/verify_common.h          |  31 +++
 contrib/amcheck/verify_nbtree.c          | 267 ++++++-----------------
 src/tools/pgindent/typedefs.list         |   1 +
 7 files changed, 297 insertions(+), 199 deletions(-)
 create mode 100644 contrib/amcheck/verify_common.c
 create mode 100644 contrib/amcheck/verify_common.h

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index 5e9002d2501..c3d70f3369c 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -3,6 +3,7 @@
 MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
+	verify_common.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
diff --git a/contrib/amcheck/expected/check_btree.out b/contrib/amcheck/expected/check_btree.out
index e7fb5f55157..c6f4b16c556 100644
--- a/contrib/amcheck/expected/check_btree.out
+++ b/contrib/amcheck/expected/check_btree.out
@@ -57,8 +57,8 @@ ERROR:  could not open relation with OID 17
 BEGIN;
 CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);
 SELECT bt_index_parent_check('bttest_a_brin_idx');
-ERROR:  only B-Tree indexes are supported as targets for verification
-DETAIL:  Relation "bttest_a_brin_idx" is not a B-Tree index.
+ERROR:  expected "btree" index as targets for verification
+DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 61d7eaf2305..67a4ac8518d 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -1,6 +1,7 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 
 amcheck_sources = files(
+  'verify_common.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
diff --git a/contrib/amcheck/verify_common.c b/contrib/amcheck/verify_common.c
new file mode 100644
index 00000000000..d095e62ce55
--- /dev/null
+++ b/contrib/amcheck/verify_common.c
@@ -0,0 +1,191 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_common.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2016-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_common.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "verify_common.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "utils/guc.h"
+#include "utils/syscache.h"
+
+static bool amcheck_index_mainfork_expected(Relation rel);
+
+
+/*
+ * Check if index relation should have a file for its main relation fork.
+ * Verification uses this to skip unlogged indexes when in hot standby mode,
+ * where there is simply nothing to verify.
+ *
+ * NB: Caller should call index_checkable() before calling here.
+ */
+static bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+* Amcheck main workhorse.
+* Given index relation OID, lock relation.
+* Next, take a number of standard actions:
+* 1) Make sure the index can be checked
+* 2) change the context of the user,
+* 3) keep track of GUCs modified via index functions
+* 4) execute callback function to verify integrity.
+*/
+void
+amcheck_lock_relation_and_check(Oid indrelid,
+								Oid am_id,
+								IndexDoCheckCallback check,
+								LOCKMODE lockmode,
+								void *state)
+{
+	Oid			heapid;
+	Relation	indrel;
+	Relation	heaprel;
+	Oid			save_userid;
+	int			save_sec_context;
+	int			save_nestlevel;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when parentcheck is true.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+	{
+		heaprel = table_open(heapid, lockmode);
+
+		/*
+		 * Switch to the table owner's userid, so that any index functions are
+		 * run as that user.  Also lock down security-restricted operations
+		 * and arrange to make GUC variable changes local to this command.
+		 */
+		GetUserIdAndSecContext(&save_userid, &save_sec_context);
+		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
+							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
+		save_nestlevel = NewGUCNestLevel();
+	}
+	else
+	{
+		heaprel = NULL;
+		/* Set these just to suppress "uninitialized variable" warnings */
+		save_userid = InvalidOid;
+		save_sec_context = -1;
+		save_nestlevel = -1;
+	}
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when parentcheck is true.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If this is a parentcheck verification, there is no question about
+	 * committed or recently dead heap tuples lacking index entries due to
+	 * concurrent activity.)
+	 */
+	indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index \"%s\"",
+						RelationGetRelationName(indrel))));
+
+	/* Check that relation suitable for checking */
+	if (index_checkable(indrel, am_id))
+		check(indrel, heaprel, state, lockmode == ShareLock);
+
+	/* Roll back any GUC changes executed by index functions */
+	AtEOXact_GUC(false, save_nestlevel);
+
+	/* Restore userid and security context */
+	SetUserIdAndSecContext(save_userid, save_sec_context);
+
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Basic checks about the suitability of a relation for checking as an index.
+ *
+ *
+ * NB: Intentionally not checking permissions, the function is normally not
+ * callable by non-superusers. If granted, it's useful to be able to check a
+ * whole cluster.
+ */
+bool
+index_checkable(Relation rel, Oid am_id)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != am_id)
+	{
+		HeapTuple	amtup;
+		HeapTuple	amtuprel;
+
+		amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));
+		amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),
+				 errdetail("Relation \"%s\" is a %s index.",
+						   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));
+	}
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+
+	return amcheck_index_mainfork_expected(rel);
+}
diff --git a/contrib/amcheck/verify_common.h b/contrib/amcheck/verify_common.h
new file mode 100644
index 00000000000..b2565bfbbab
--- /dev/null
+++ b/contrib/amcheck/verify_common.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2016-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/bufpage.h"
+#include "storage/lmgr.h"
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+#include "miscadmin.h"
+
+/* Typedefs for callback functions for amcheck_lock_relation */
+typedef void (*IndexCheckableCallback) (Relation index);
+typedef void (*IndexDoCheckCallback) (Relation rel,
+									  Relation heaprel,
+									  void *state,
+									  bool readonly);
+
+extern void amcheck_lock_relation_and_check(Oid indrelid,
+											Oid am_id,
+											IndexDoCheckCallback check,
+											LOCKMODE lockmode, void *state);
+
+extern bool index_checkable(Relation rel, Oid am_id);
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index d56eb7637d3..f11c43a0ed7 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -30,6 +30,7 @@
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "verify_common.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
 #include "catalog/pg_opfamily_d.h"
@@ -159,14 +160,22 @@ typedef struct BtreeLastVisibleEntry
 	ItemPointer tid;			/* Heap tid */
 } BtreeLastVisibleEntry;
 
+/*
+ * arguments for the bt_index_check_callback callback
+ */
+typedef struct BTCallbackState
+{
+	bool		parentcheck;
+	bool		heapallindexed;
+	bool		rootdescend;
+	bool		checkunique;
+} BTCallbackState;
+
 PG_FUNCTION_INFO_V1(bt_index_check);
 PG_FUNCTION_INFO_V1(bt_index_parent_check);
 
-static void bt_index_check_internal(Oid indrelid, bool parentcheck,
-									bool heapallindexed, bool rootdescend,
-									bool checkunique);
-static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
+static void bt_index_check_callback(Relation indrel, Relation heaprel,
+									void *state, bool readonly);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend, bool checkunique);
@@ -241,15 +250,21 @@ Datum
 bt_index_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = false;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
-	if (PG_NARGS() == 3)
-		checkunique = PG_GETARG_BOOL(2);
+		args.heapallindexed = PG_GETARG_BOOL(1);
+	if (PG_NARGS() >= 3)
+		args.checkunique = PG_GETARG_BOOL(2);
 
-	bt_index_check_internal(indrelid, false, heapallindexed, false, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									AccessShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -267,18 +282,23 @@ Datum
 bt_index_parent_check(PG_FUNCTION_ARGS)
 {
 	Oid			indrelid = PG_GETARG_OID(0);
-	bool		heapallindexed = false;
-	bool		rootdescend = false;
-	bool		checkunique = false;
+	BTCallbackState args;
+
+	args.heapallindexed = false;
+	args.rootdescend = false;
+	args.parentcheck = true;
+	args.checkunique = false;
 
 	if (PG_NARGS() >= 2)
-		heapallindexed = PG_GETARG_BOOL(1);
+		args.heapallindexed = PG_GETARG_BOOL(1);
 	if (PG_NARGS() >= 3)
-		rootdescend = PG_GETARG_BOOL(2);
-	if (PG_NARGS() == 4)
-		checkunique = PG_GETARG_BOOL(3);
+		args.rootdescend = PG_GETARG_BOOL(2);
+	if (PG_NARGS() >= 4)
+		args.checkunique = PG_GETARG_BOOL(3);
 
-	bt_index_check_internal(indrelid, true, heapallindexed, rootdescend, checkunique);
+	amcheck_lock_relation_and_check(indrelid, BTREE_AM_OID,
+									bt_index_check_callback,
+									ShareLock, &args);
 
 	PG_RETURN_VOID();
 }
@@ -287,193 +307,46 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
  * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
  */
 static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend, bool checkunique)
+bt_index_check_callback(Relation indrel, Relation heaprel, void *state, bool readonly)
 {
-	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-	Oid			save_userid;
-	int			save_sec_context;
-	int			save_nestlevel;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
-
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-	{
-		heaprel = table_open(heapid, lockmode);
-
-		/*
-		 * Switch to the table owner's userid, so that any index functions are
-		 * run as that user.  Also lock down security-restricted operations
-		 * and arrange to make GUC variable changes local to this command.
-		 */
-		GetUserIdAndSecContext(&save_userid, &save_sec_context);
-		SetUserIdAndSecContext(heaprel->rd_rel->relowner,
-							   save_sec_context | SECURITY_RESTRICTED_OPERATION);
-		save_nestlevel = NewGUCNestLevel();
-		RestrictSearchPath();
-	}
-	else
-	{
-		heaprel = NULL;
-		/* Set these just to suppress "uninitialized variable" warnings */
-		save_userid = InvalidOid;
-		save_sec_context = -1;
-		save_nestlevel = -1;
-	}
+	BTCallbackState *args = (BTCallbackState *) state;
+	bool		heapkeyspace,
+				allequalimage;
 
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
 		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index \"%s\"",
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" lacks a main relation fork",
 						RelationGetRelationName(indrel))));
 
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	if (btree_index_mainfork_expected(indrel))
+	/* Extract metadata from metapage, and sanitize it in passing */
+	_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
+	if (allequalimage && !heapkeyspace)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
+						RelationGetRelationName(indrel))));
+	if (allequalimage && !_bt_allequalimage(indrel, false))
 	{
-		bool		heapkeyspace,
-					allequalimage;
+		bool		has_interval_ops = false;
 
-		if (!smgrexists(RelationGetSmgr(indrel), MAIN_FORKNUM))
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" lacks a main relation fork",
-							RelationGetRelationName(indrel))));
-
-		/* Extract metadata from metapage, and sanitize it in passing */
-		_bt_metaversion(indrel, &heapkeyspace, &allequalimage);
-		if (allequalimage && !heapkeyspace)
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage has equalimage field set on unsupported nbtree version",
-							RelationGetRelationName(indrel))));
-		if (allequalimage && !_bt_allequalimage(indrel, false))
-		{
-			bool		has_interval_ops = false;
-
-			for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
-				if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
-					has_interval_ops = true;
-			ereport(ERROR,
-					(errcode(ERRCODE_INDEX_CORRUPTED),
-					 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
-							RelationGetRelationName(indrel)),
-					 has_interval_ops
-					 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
-					 : 0));
-		}
-
-		/* Check index, possibly against table it is an index on */
-		bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-							 heapallindexed, rootdescend, checkunique);
+		for (int i = 0; i < IndexRelationGetNumberOfKeyAttributes(indrel); i++)
+			if (indrel->rd_opfamily[i] == INTERVAL_BTREE_FAM_OID)
+			{
+				has_interval_ops = true;
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" metapage incorrectly indicates that deduplication is safe",
+								RelationGetRelationName(indrel)),
+						 has_interval_ops
+						 ? errhint("This is known of \"interval\" indexes last built on a version predating 2023-11.")
+						 : 0));
+			}
 	}
 
-	/* Roll back any GUC changes executed by index functions */
-	AtEOXact_GUC(false, save_nestlevel);
-
-	/* Restore userid and security context */
-	SetUserIdAndSecContext(save_userid, save_sec_context);
-
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
-}
-
-/*
- * Basic checks about the suitability of a relation for checking as a B-Tree
- * index.
- *
- * NB: Intentionally not checking permissions, the function is normally not
- * callable by non-superusers. If granted, it's useful to be able to check a
- * whole cluster.
- */
-static inline void
-btree_index_checkable(Relation rel)
-{
-	if (rel->rd_rel->relkind != RELKIND_INDEX ||
-		rel->rd_rel->relam != BTREE_AM_OID)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("only B-Tree indexes are supported as targets for verification"),
-				 errdetail("Relation \"%s\" is not a B-Tree index.",
-						   RelationGetRelationName(rel))));
-
-	if (RELATION_IS_OTHER_TEMP(rel))
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot access temporary tables of other sessions"),
-				 errdetail("Index \"%s\" is associated with temporary relation.",
-						   RelationGetRelationName(rel))));
-
-	if (!rel->rd_index->indisvalid)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("cannot check index \"%s\"",
-						RelationGetRelationName(rel)),
-				 errdetail("Index is not valid.")));
-}
-
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.  We behave as if the
- * relation is empty.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(DEBUG1,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, heapkeyspace, readonly,
+						 args->heapallindexed, args->rootdescend, args->checkunique);
 }
 
 /*
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1279b69422a..b66affbea56 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -194,6 +194,7 @@ BOOLEAN
 BOX
 BTArrayKeyInfo
 BTBuildState
+BTCallbackState
 BTCycleId
 BTDedupInterval
 BTDedupState
-- 
2.49.0

v20250328-0003-amcheck-Add-gin_index_check-to-verify-GIN-.patch (text/x-patch)
From 9f85cac7e03aa9e24fad52399949c0dc4858dda8 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Fri, 28 Mar 2025 16:08:05 +0100
Subject: [PATCH v20250328 3/6] amcheck: Add gin_index_check() to verify GIN
 index
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The new function validates two kinds of invariants on a GIN index:

- parent-child consistency: Paths in a GIN graph have to contain
  consistent keys: tuples on parent pages consistently include tuples
  from child pages. That is, parent tuples must not require any
  adjustments.

- balanced-tree / graph: Each internal page has at least one downlink,
  and can reference either only leaf pages or only internal pages.
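
For illustration, a minimal usage sketch (the index name is hypothetical;
the extension must be updated to version 1.5 first):

    CREATE EXTENSION IF NOT EXISTS amcheck;
    ALTER EXTENSION amcheck UPDATE TO '1.5';
    -- Raises an ERRCODE_INDEX_CORRUPTED error if either invariant is violated.
    SELECT gin_index_check('some_gin_idx');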

The GIN verification is based on work by Grigory Kryachko, reworked by
Heikki Linnakangas and with various improvements by Andrey Borodin.
Investigation and fixes for a couple bugs by Kirill Reshke.

Author: Grigory Kryachko <GSKryachko@gmail.com>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Andrey Borodin <amborodin@acm.org>
Reviewed-By: José Villanova <jose.arthur@gmail.com>
Reviewed-By: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-By: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-By: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/45AC9B0A-2B45-40EE-B08F-BDCF5739D1E1%40yandex-team.ru
---
 contrib/amcheck/Makefile               |   6 +-
 contrib/amcheck/amcheck--1.4--1.5.sql  |  14 +
 contrib/amcheck/amcheck.control        |   2 +-
 contrib/amcheck/expected/check_gin.out |  64 ++
 contrib/amcheck/meson.build            |   3 +
 contrib/amcheck/sql/check_gin.sql      |  40 ++
 contrib/amcheck/verify_gin.c           | 798 +++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml              |  20 +
 src/tools/pgindent/typedefs.list       |   2 +
 9 files changed, 946 insertions(+), 3 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.4--1.5.sql
 create mode 100644 contrib/amcheck/expected/check_gin.out
 create mode 100644 contrib/amcheck/sql/check_gin.sql
 create mode 100644 contrib/amcheck/verify_gin.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c3d70f3369c..1b7a63cbaa4 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -4,14 +4,16 @@ MODULE_big	= amcheck
 OBJS = \
 	$(WIN32RES) \
 	verify_common.o \
+	verify_gin.o \
 	verify_heapam.o \
 	verify_nbtree.o
 
 EXTENSION = amcheck
-DATA = amcheck--1.3--1.4.sql amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql \
+		amcheck--1.3--1.4.sql amcheck--1.4--1.5.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree check_heap
+REGRESS = check check_btree check_gin check_heap
 
 EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
diff --git a/contrib/amcheck/amcheck--1.4--1.5.sql b/contrib/amcheck/amcheck--1.4--1.5.sql
new file mode 100644
index 00000000000..445c48ccb7d
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.4--1.5.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.4--1.5.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.5'" to load this file. \quit
+
+
+-- gin_index_check()
+--
+CREATE FUNCTION gin_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gin_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gin_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index e67ace01c99..c8ba6d7c9bc 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.4'
+default_version = '1.5'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
new file mode 100644
index 00000000000..bbcde80e627
--- /dev/null
+++ b/contrib/amcheck/expected/check_gin.out
@@ -0,0 +1,64 @@
+-- Test of index bulk load
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test index inserts
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+SELECT gin_index_check('gin_check_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check;
+-- Test GIN over text array
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 67a4ac8518d..b33e8c9b062 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -2,6 +2,7 @@
 
 amcheck_sources = files(
   'verify_common.c',
+  'verify_gin.c',
   'verify_heapam.c',
   'verify_nbtree.c',
 )
@@ -25,6 +26,7 @@ install_data(
   'amcheck--1.1--1.2.sql',
   'amcheck--1.2--1.3.sql',
   'amcheck--1.3--1.4.sql',
+  'amcheck--1.4--1.5.sql',
   kwargs: contrib_data_args,
 )
 
@@ -36,6 +38,7 @@ tests += {
     'sql': [
       'check',
       'check_btree',
+      'check_gin',
       'check_heap',
     ],
   },
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
new file mode 100644
index 00000000000..bbd9b9f8281
--- /dev/null
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -0,0 +1,40 @@
+-- Test of index bulk load
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+-- posting trees (frequently used entries)
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves (sparse entries)
+INSERT INTO gin_check select array_agg(255 + round(random()*100)) from generate_series(1, 100) as i group by i % 100;
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test index inserts
+SELECT setseed(1);
+CREATE TABLE "gin_check"("Column1" int[]);
+CREATE INDEX gin_check_idx on "gin_check" USING GIN("Column1");
+ALTER INDEX gin_check_idx SET (fastupdate = false);
+-- posting trees
+INSERT INTO gin_check select array_agg(round(random()*255) ) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check select array_agg(100 + round(random()*255)) from generate_series(1, 100) as i group by i % 100;
+
+SELECT gin_index_check('gin_check_idx');
+
+-- cleanup
+DROP TABLE gin_check;
+
+-- Test GIN over text array
+SELECT setseed(1);
+CREATE TABLE "gin_check_text_array"("Column1" text[]);
+-- posting trees
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300)::text)::text) from generate_series(1, 100000) as i group by i % 10000;
+-- posting leaves
+INSERT INTO gin_check_text_array select array_agg(md5(round(random()*300 + 300)::text)::text) from generate_series(1, 10000) as i group by i % 100;
+CREATE INDEX gin_check_text_array_idx on "gin_check_text_array" USING GIN("Column1");
+SELECT gin_index_check('gin_check_text_array_idx');
+
+-- cleanup
+DROP TABLE gin_check_text_array;
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
new file mode 100644
index 00000000000..670f53637d4
--- /dev/null
+++ b/contrib/amcheck/verify_gin.c
@@ -0,0 +1,798 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gin.c
+ *		Verifies the integrity of GIN indexes based on invariants.
+ *
+ *
+ * GIN index verification checks a number of invariants:
+ *
+ * - consistency: Paths in the GIN graph have to contain consistent keys:
+ *   tuples on parent pages consistently include tuples from child pages.
+ *
+ * - graph invariants: Each internal page must have at least one downlink, and
+ *   can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2016-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gin.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gin_private.h"
+#include "access/nbtree.h"
+#include "catalog/pg_am.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+#include "verify_common.h"
+#include "string.h"
+
+/*
+ * GinScanItem represents one item of depth-first scan of the index.
+ */
+typedef struct GinScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GinScanItem *next;
+} GinScanItem;
+
+/*
+ * GinPostingTreeScanItem represents one item of a depth-first posting tree scan.
+ */
+typedef struct GinPostingTreeScanItem
+{
+	int			depth;
+	ItemPointerData parentkey;
+	BlockNumber parentblk;
+	BlockNumber blkno;
+	struct GinPostingTreeScanItem *next;
+} GinPostingTreeScanItem;
+
+
+PG_FUNCTION_INFO_V1(gin_index_check);
+
+static void gin_check_parent_keys_consistency(Relation rel,
+											  Relation heaprel,
+											  void *callback_state, bool readonly);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gin_refind_parent(Relation rel,
+									BlockNumber parentblkno,
+									BlockNumber childblkno,
+									BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gin_index_check(index regclass)
+ *
+ * Verify integrity of GIN index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum
+gin_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+
+	amcheck_lock_relation_and_check(indrelid,
+									GIN_AM_OID,
+									gin_check_parent_keys_consistency,
+									AccessShareLock,
+									NULL);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Read item pointers from leaf entry tuple.
+ *
+ * Returns a palloc'd array of ItemPointers. The number of items is returned
+ * in *nitems.
+ */
+static ItemPointer
+ginReadTupleWithoutState(IndexTuple itup, int *nitems)
+{
+	Pointer		ptr = GinGetPosting(itup);
+	int			nipd = GinGetNPosting(itup);
+	ItemPointer ipd;
+	int			ndecoded;
+
+	if (GinItupIsCompressed(itup))
+	{
+		if (nipd > 0)
+		{
+			ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);
+			if (nipd != ndecoded)
+				elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",
+					 nipd, ndecoded);
+		}
+		else
+			ipd = palloc(0);
+	}
+	else
+	{
+		ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);
+		memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);
+	}
+	*nitems = nipd;
+	return ipd;
+}
+
+/*
+ * Scans through a posting tree (given by its root), and verifies that the
+ * keys on child pages are consistent with their parents.
+ *
+ * Allocates a separate memory context and scans through posting tree graph.
+ */
+static void
+gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting_tree_root)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinPostingTreeScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "posting tree check context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));
+	stack->depth = 0;
+	ItemPointerSetInvalid(&stack->parentkey);
+	stack->parentblk = InvalidBlockNumber;
+	stack->blkno = posting_tree_root;
+
+	elog(DEBUG3, "processing posting tree at blk %u", posting_tree_root);
+
+	while (stack)
+	{
+		GinPostingTreeScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+
+		Assert(GinPageIsData(page));
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			ItemPointerData minItem;
+			int			nlist;
+			ItemPointerData *list;
+			char		tidrange_buf[MAXPGPATH];
+
+			ItemPointerSetMin(&minItem);
+
+			elog(DEBUG1, "page blk: %u, type leaf", stack->blkno);
+
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			list = GinDataLeafPageGetItems(page, &nlist, minItem);
+
+			if (nlist > 0)
+				snprintf(tidrange_buf, sizeof(tidrange_buf),
+						 "%d tids (%u, %u) - (%u, %u)",
+						 nlist,
+						 ItemPointerGetBlockNumberNoCheck(&list[0]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[0]),
+						 ItemPointerGetBlockNumberNoCheck(&list[nlist - 1]),
+						 ItemPointerGetOffsetNumberNoCheck(&list[nlist - 1]));
+			else
+				snprintf(tidrange_buf, sizeof(tidrange_buf), "0 tids");
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: parent %u highkey (%u, %u), %s",
+					 stack->blkno,
+					 stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey),
+					 tidrange_buf);
+			else
+				elog(DEBUG3, "blk %u: root leaf, %s",
+					 stack->blkno,
+					 tidrange_buf);
+
+			if (stack->parentblk != InvalidBlockNumber &&
+				ItemPointerGetOffsetNumberNoCheck(&stack->parentkey) != InvalidOffsetNumber &&
+				nlist > 0 && ItemPointerCompare(&stack->parentkey, &list[nlist - 1]) < 0)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": tid exceeds parent's high key in postingTree leaf on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+		else
+		{
+			LocationIndex pd_lower;
+			ItemPointerData bound;
+			int			lowersize;
+
+			/*
+			 * Check that tuples in each page are properly ordered and
+			 * consistent with parent high key
+			 */
+			maxoff = GinPageGetOpaque(page)->maxoff;
+			rightlink = GinPageGetOpaque(page)->rightlink;
+
+			elog(DEBUG1, "page blk: %u, type data, maxoff %d", stack->blkno, maxoff);
+
+			if (stack->parentblk != InvalidBlockNumber)
+				elog(DEBUG3, "blk %u: internal posting tree page with %u items, parent %u highkey (%u, %u)",
+					 stack->blkno, maxoff, stack->parentblk,
+					 ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+					 ItemPointerGetOffsetNumberNoCheck(&stack->parentkey));
+			else
+				elog(DEBUG3, "blk %u: root internal posting tree page with %u items",
+					 stack->blkno, maxoff);
+
+			/*
+			 * A GIN posting tree internal page stores PostingItems in the
+			 * 'lower' part of the page. The 'upper' part is unused. The
+			 * number of elements is stored in the opaque area (maxoff). Make
+			 * sure the size of the 'lower' part agrees with 'maxoff'
+			 *
+			 * We didn't set pd_lower until PostgreSQL version 9.4, so if this
+			 * check fails, it could also be because the index was
+			 * binary-upgraded from an earlier version. That was a long time
+			 * ago, though, so let's warn if it doesn't match.
+			 */
+			pd_lower = ((PageHeader) page)->pd_lower;
+			lowersize = pd_lower - MAXALIGN(SizeOfPageHeaderData);
+			if ((lowersize - MAXALIGN(sizeof(ItemPointerData))) / sizeof(PostingItem) != maxoff)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has unexpected pd_lower %u in posting tree block %u with maxoff %u",
+								RelationGetRelationName(rel), pd_lower, stack->blkno, maxoff)));
+
+			/*
+			 * Before the PostingItems, there's one ItemPointerData in the
+			 * 'lower' part that stores the page's high key.
+			 */
+			bound = *GinDataPageGetRightBound(page);
+
+			/*
+			 * The GIN page right bound has a sane value only when this is not
+			 * the rightmost page at its level: the rightmost page does not
+			 * store its high key explicitly, and the value is infinity.
+			 */
+			if (ItemPointerIsValid(&stack->parentkey) &&
+				rightlink != InvalidBlockNumber &&
+				!ItemPointerEquals(&stack->parentkey, &bound))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": posting tree page's high key (%u, %u) doesn't match the downlink on block %u (parent blk %u, key (%u, %u))",
+								RelationGetRelationName(rel),
+								ItemPointerGetBlockNumberNoCheck(&bound),
+								ItemPointerGetOffsetNumberNoCheck(&bound),
+								stack->blkno, stack->parentblk,
+								ItemPointerGetBlockNumberNoCheck(&stack->parentkey),
+								ItemPointerGetOffsetNumberNoCheck(&stack->parentkey))));
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				GinPostingTreeScanItem *ptr;
+				PostingItem *posting_item = GinDataPageGetPostingItem(page, i);
+
+				/* ItemPointerGetOffsetNumber expects a valid pointer */
+				if (!(i == maxoff &&
+					  rightlink == InvalidBlockNumber))
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 ItemPointerGetBlockNumber(&posting_item->key),
+						 ItemPointerGetOffsetNumber(&posting_item->key),
+						 BlockIdGetBlockNumber(&posting_item->child_blkno));
+				else
+					elog(DEBUG3, "key (%u, %u) -> %u",
+						 0, 0, BlockIdGetBlockNumber(&posting_item->child_blkno));
+
+				if (i == maxoff && rightlink == InvalidBlockNumber)
+				{
+					/*
+					 * The rightmost item in the tree level has (0, 0) as the
+					 * key
+					 */
+					if (ItemPointerGetBlockNumberNoCheck(&posting_item->key) != 0 ||
+						ItemPointerGetOffsetNumberNoCheck(&posting_item->key) != 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": rightmost posting tree page (blk %u) has unexpected last key (%u, %u)",
+										RelationGetRelationName(rel),
+										stack->blkno,
+										ItemPointerGetBlockNumberNoCheck(&posting_item->key),
+										ItemPointerGetOffsetNumberNoCheck(&posting_item->key))));
+				}
+				else if (i != FirstOffsetNumber)
+				{
+					PostingItem *previous_posting_item = GinDataPageGetPostingItem(page, i - 1);
+
+					if (ItemPointerCompare(&posting_item->key, &previous_posting_item->key) < 0)
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\" has wrong tuple order in posting tree, block %u, offset %u",
+										RelationGetRelationName(rel), stack->blkno, i)));
+				}
+
+				/*
+				 * Check if this tuple is consistent with the downlink in the
+				 * parent.
+				 */
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\": posting item exceeds parent's high key in postingTree internal page on block %u offset %u",
+									RelationGetRelationName(rel),
+									stack->blkno, i)));
+
+				/* This is an internal page, recurse into the child. */
+				ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));
+				ptr->depth = stack->depth + 1;
+
+				/*
+				 * Set the rightmost parent key to an invalid item pointer.
+				 * Its value is 'Infinity' and is not stored explicitly.
+				 */
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
+					ptr->parentkey = posting_item->key;
+
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Main entry point for GIN checks.
+ *
+ * Allocates memory context and scans through the whole GIN graph.
+ */
+static void
+gin_check_parent_keys_consistency(Relation rel,
+								  Relation heaprel,
+								  void *callback_state,
+								  bool readonly)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GinScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GinState	state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck consistency check context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+	initGinState(&state, rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GinScanItem *) palloc0(sizeof(GinScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIN_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GinScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff,
+					prev_attnum;
+		XLogRecPtr	lsn;
+		IndexTuple	prev_tuple;
+		BlockNumber rightlink;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIN_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+		rightlink = GinPageGetOpaque(page)->rightlink;
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		elog(DEBUG3, "processing entry tree page at blk %u, maxoff: %u", stack->blkno, maxoff);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (stack->parenttup != NULL)
+		{
+			GinNullCategory parent_key_category;
+			Datum		parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
+												   page, maxoff);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory page_max_key_category;
+			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
+
+			if (rightlink != InvalidBlockNumber &&
+				ginCompareEntries(&state, attnum, page_max_key,
+								  page_max_key_category, parent_key,
+								  parent_key_category) > 0)
+			{
+				/* split page detected, install right link to the stack */
+				GinScanItem *ptr;
+
+				elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth;
+				ptr->parenttup = CopyIndexTuple(stack->parenttup);
+				ptr->parentblk = stack->parentblk;
+				ptr->parentlsn = stack->parentlsn;
+				ptr->blkno = rightlink;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GinPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that tuples in each page are properly ordered and consistent
+		 * with parent high key
+		 */
+		prev_tuple = NULL;
+		prev_attnum = InvalidAttrNumber;
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			GinNullCategory prev_key_category;
+			Datum		prev_key;
+			GinNullCategory current_key_category;
+			Datum		current_key;
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
+
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 *
+			 * Also make sure to not compare entries for different attnums, which
+			 * may be stored on the same page.
+			 */
+			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
+				!(i == maxoff && rightlink == InvalidBlockNumber))
+			{
+				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
+				if (ginCompareEntries(&state, attnum, prev_key,
+									  prev_key_category, current_key,
+									  current_key_category) >= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
+									RelationGetRelationName(rel), stack->blkno, i, rightlink)));
+			}
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				i == maxoff)
+			{
+				GinNullCategory parent_key_category;
+				Datum		parent_key = gintuple_get_key(&state,
+														  stack->parenttup,
+														  &parent_key_category);
+
+				if (ginCompareEntries(&state, attnum, current_key,
+									  current_key_category, parent_key,
+									  parent_key_category) > 0)
+				{
+					/*
+					 * There was a discrepancy between parent and child
+					 * tuples. We need to verify that it is not the result of
+					 * a concurrent page split. So, lock the parent and try to
+					 * re-find the downlink for the current page. It may be
+					 * missing due to a concurrent page split, which is OK.
+					 */
+					pfree(stack->parenttup);
+					stack->parenttup = gin_refind_parent(rel, stack->parentblk,
+														 stack->blkno, strategy);
+
+					/* If we re-found it, make a final check before failing */
+					if (!stack->parenttup)
+						elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+							 stack->blkno, stack->parentblk);
+					else
+					{
+						parent_key = gintuple_get_key(&state,
+													  stack->parenttup,
+													  &parent_key_category);
+
+						/*
+						 * Check if it is properly adjusted. If the check
+						 * succeeds, proceed to the next key.
+						 */
+						if (ginCompareEntries(&state, attnum, current_key,
+											  current_key_category, parent_key,
+											  parent_key_category) > 0)
+							ereport(ERROR,
+									(errcode(ERRCODE_INDEX_CORRUPTED),
+									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+											RelationGetRelationName(rel), stack->blkno, i)));
+					}
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GinPageIsLeaf(page))
+			{
+				GinScanItem *ptr;
+
+				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
+				ptr->depth = stack->depth + 1;
+				/* last tuple in layer has no high key */
+				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
+					ptr->parenttup = CopyIndexTuple(idxtuple);
+				else
+					ptr->parenttup = NULL;
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = GinGetDownlink(idxtuple);
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+			/* If this item is a pointer to a posting tree, recurse into it */
+			else if (GinIsPostingTree(idxtuple))
+			{
+				BlockNumber rootPostingTree = GinGetPostingTree(idxtuple);
+
+				gin_check_posting_tree_parent_keys_consistency(rel, rootPostingTree);
+			}
+			else
+			{
+				ItemPointer ipd;
+				int			nipd;
+
+				ipd = ginReadTupleWithoutState(idxtuple, &nipd);
+
+				for (int j = 0; j < nipd; j++)
+				{
+					if (!OffsetNumberIsValid(ItemPointerGetOffsetNumber(&ipd[j])))
+						ereport(ERROR,
+								(errcode(ERRCODE_INDEX_CORRUPTED),
+								 errmsg("index \"%s\": posting list contains invalid heap pointer on block %u",
+										RelationGetRelationName(rel), stack->blkno)));
+				}
+				pfree(ipd);
+			}
+
+			prev_tuple = CopyIndexTuple(idxtuple);
+			prev_attnum = attnum;
+		}
+
+		LockBuffer(buffer, GIN_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/*
+ * Verify that a freshly-read page looks sane.
+ */
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	/*
+	 * ReadBuffer verifies that every newly-read page passes
+	 * PageHeaderIsValid, which means it either contains a reasonably sane
+	 * page header or is all-zero.  We have to defend against the all-zero
+	 * case, however.
+	 */
+	if (PageIsNew(page))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains unexpected zero page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	/*
+	 * Additionally check that the special area looks sane.
+	 */
+	if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" contains corrupted page at block %u",
+						RelationGetRelationName(rel),
+						BufferGetBlockNumber(buffer)),
+				 errhint("Please REINDEX it.")));
+
+	if (GinPageIsDeleted(page))
+	{
+		if (!GinPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %d with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gin_refind_parent(Relation rel, BlockNumber parentblkno,
+				  BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIN_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GinPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - MAXALIGN(sizeof(GinPageOpaqueData)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED or LP_DEAD,
+	 * since GIN never uses any of those.  Verify that line pointer has storage,
+	 * too.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdIsDead(itemid) || ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index a12aa3abf01..98f836e15e7 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -188,6 +188,26 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gin_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gin_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gin_index_check</function> tests that its target GIN index
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
+
   </variablelist>
   <tip>
    <para>
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b66affbea56..b66cecd8799 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1053,8 +1053,10 @@ GinPageOpaque
 GinPageOpaqueData
 GinPlaceToPageRC
 GinPostingList
+GinPostingTreeScanItem
 GinQualCounts
 GinScanEntry
+GinScanItem
 GinScanKey
 GinScanOpaque
 GinScanOpaqueData
-- 
2.49.0

v20250328-0004-amcheck-Add-a-test-with-GIN-index-on-JSONB.patch (text/x-patch)
From d11a04e9712e7f278923c2ad2474f9d251174263 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Fri, 28 Mar 2025 16:53:31 +0100
Subject: [PATCH v20250328 4/6] amcheck: Add a test with GIN index on JSONB
 data

Extend the existing test of GIN checks to also include an index on JSONB
data, using the jsonb_path_ops opclass. This is a common enough usage of
GIN that it makes sense to have better test coverage for it.
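
As a sketch of the query shape such an index serves (jsonb_path_ops is
built for containment lookups; the table, index, and value here mirror
the test added below):

    CREATE INDEX gin_check_jsonb_idx
        ON gin_check_jsonb USING gin (j jsonb_path_ops);
    SELECT * FROM gin_check_jsonb WHERE j @> '{"c": 3}'::jsonb;
    SELECT gin_index_check('gin_check_jsonb_idx');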

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
---
 contrib/amcheck/expected/check_gin.out | 14 ++++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 12 ++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index bbcde80e627..93147de0ef1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -62,3 +62,17 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+SELECT gin_index_check('gin_check_jsonb_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index bbd9b9f8281..92ddbbc7a89 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -38,3 +38,15 @@ SELECT gin_index_check('gin_check_text_array_idx');
 
 -- cleanup
 DROP TABLE gin_check_text_array;
+
+-- Test GIN over jsonb
+CREATE TABLE "gin_check_jsonb"("j" jsonb);
+INSERT INTO gin_check_jsonb values ('{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
+INSERT INTO gin_check_jsonb values ('[[14,2,3]]');
+INSERT INTO gin_check_jsonb values ('[1,[14,2,3]]');
+CREATE INDEX "gin_check_jsonb_idx" on gin_check_jsonb USING GIN("j" jsonb_path_ops);
+
+SELECT gin_index_check('gin_check_jsonb_idx');
+
+-- cleanup
+DROP TABLE gin_check_jsonb;
-- 
2.49.0

v20250328-0005-amcheck-Add-a-GIN-index-to-the-CREATE-INDE.patch (text/x-patch)
From 17562587b68a5d29245b4c7d852d0c29876ace54 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Fri, 28 Mar 2025 17:05:18 +0100
Subject: [PATCH v20250328 5/6] amcheck: Add a GIN index to the CREATE INDEX
 CONCURRENTLY tests

The existing CREATE INDEX CONCURRENTLY tests check only B-Tree, but they
can be cheaply extended to also check GIN. This helps increase test
coverage for GIN amcheck, especially related to handling concurrent page
splits and posting trees.

This already helped to identify several issues during development of the
GIN amcheck support.
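
For reference, the check cycle exercised under concurrent load boils down
to this psql fragment (condensed from the 002_cic.pl hunk below; the
advisory lock ensures only one backend rebuilds the index at a time):

    SELECT pg_try_advisory_lock(42)::integer AS gotlock \gset
    \if :gotlock
        DROP INDEX CONCURRENTLY ginidx;
        CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
        SELECT gin_index_check('ginidx');
        SELECT pg_advisory_unlock(42);
    \endif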

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Reviewed-By: Tomas Vondra <tomas.vondra@enterprisedb.com>
Reviewed-By: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/BC221A56-977C-418E-A1B8-9EFC881D80C5%40enterprisedb.com
---
 contrib/amcheck/t/002_cic.pl     | 10 +++++---
 contrib/amcheck/t/003_cic_2pc.pl | 40 ++++++++++++++++++++++++++------
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/contrib/amcheck/t/002_cic.pl b/contrib/amcheck/t/002_cic.pl
index 0b6a5a9e464..6a0c4f61125 100644
--- a/contrib/amcheck/t/002_cic.pl
+++ b/contrib/amcheck/t/002_cic.pl
@@ -21,8 +21,9 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 $node->safe_psql('postgres', q(CREATE INDEX idx ON tbl(i)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(j)));
 
 #
 # Stress CIC with pgbench.
@@ -40,13 +41,13 @@ $node->pgbench(
 	{
 		'002_pgbench_concurrent_transaction' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_transaction_savepoints' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 			COMMIT;
 		  ),
 		'002_pgbench_concurrent_cic' => q(
@@ -54,7 +55,10 @@ $node->pgbench(
 			\if :gotlock
 				DROP INDEX CONCURRENTLY idx;
 				CREATE INDEX CONCURRENTLY idx ON tbl(i);
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 				SELECT bt_index_check('idx',true);
+				SELECT gin_index_check('ginidx');
 				SELECT pg_advisory_unlock(42);
 			\endif
 		  )
diff --git a/contrib/amcheck/t/003_cic_2pc.pl b/contrib/amcheck/t/003_cic_2pc.pl
index 9134487f3b4..00a446a381f 100644
--- a/contrib/amcheck/t/003_cic_2pc.pl
+++ b/contrib/amcheck/t/003_cic_2pc.pl
@@ -25,7 +25,7 @@ $node->append_conf('postgresql.conf',
 	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
 $node->start;
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
-$node->safe_psql('postgres', q(CREATE TABLE tbl(i int)));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i int, j jsonb)));
 
 
 #
@@ -41,7 +41,7 @@ my $main_h = $node->background_psql('postgres');
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 my $cic_h = $node->background_psql('postgres');
@@ -50,6 +50,7 @@ $cic_h->query_until(
 	qr/start/, q(
 \echo start
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $main_h->query_safe(
@@ -60,7 +61,7 @@ PREPARE TRANSACTION 'a';
 $main_h->query_safe(
 	q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '[[14,2,3]]');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'a';));
@@ -69,7 +70,7 @@ $main_h->query_safe(
 	q(
 PREPARE TRANSACTION 'b';
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '"mary had a little lamb"');
 ));
 
 $node->safe_psql('postgres', q(COMMIT PREPARED 'b';));
@@ -86,6 +87,9 @@ $cic_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after overlapping 2PC');
 
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after overlapping 2PC');
+
 
 #
 # Server restart shall not change whether prepared xact blocks CIC
@@ -94,7 +98,7 @@ is($result, '0', 'bt_index_check after overlapping 2PC');
 $node->safe_psql(
 	'postgres', q(
 BEGIN;
-INSERT INTO tbl VALUES(0);
+INSERT INTO tbl VALUES(0, '{"a":[["b",{"x":1}],["b",{"x":2}]],"c":3}');
 PREPARE TRANSACTION 'spans_restart';
 BEGIN;
 CREATE TABLE unused ();
@@ -108,12 +112,16 @@ $reindex_h->query_until(
 \echo start
 DROP INDEX CONCURRENTLY idx;
 CREATE INDEX CONCURRENTLY idx ON tbl(i);
+DROP INDEX CONCURRENTLY ginidx;
+CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
 ));
 
 $node->safe_psql('postgres', "COMMIT PREPARED 'spans_restart'");
 $reindex_h->quit;
 $result = $node->psql('postgres', q(SELECT bt_index_check('idx',true)));
 is($result, '0', 'bt_index_check after 2PC and restart');
+$result = $node->psql('postgres', q(SELECT gin_index_check('ginidx')));
+is($result, '0', 'gin_index_check after 2PC and restart');
 
 
 #
@@ -136,14 +144,14 @@ $node->pgbench(
 	{
 		'003_pgbench_concurrent_2pc' => q(
 			BEGIN;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'null');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
 		'003_pgbench_concurrent_2pc_savepoint' => q(
 			BEGIN;
 			SAVEPOINT s1;
-			INSERT INTO tbl VALUES(0);
+			INSERT INTO tbl VALUES(0,'[false, "jnvaba", -76, 7, {"_": [1]}, 9]');
 			PREPARE TRANSACTION 'c:client_id';
 			COMMIT PREPARED 'c:client_id';
 		  ),
@@ -163,7 +171,25 @@ $node->pgbench(
 				SELECT bt_index_check('idx',true);
 				SELECT pg_advisory_unlock(42);
 			\endif
+		  ),
+		'005_pgbench_concurrent_cic' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				DROP INDEX CONCURRENTLY ginidx;
+				CREATE INDEX CONCURRENTLY ginidx ON tbl USING gin(j);
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
+		  ),
+		'006_pgbench_concurrent_ric' => q(
+			SELECT pg_try_advisory_lock(42)::integer AS gotginlock \gset
+			\if :gotginlock
+				REINDEX INDEX CONCURRENTLY ginidx;
+				SELECT gin_index_check('ginidx');
+				SELECT pg_advisory_unlock(42);
+			\endif
 		  )
+
 	});
 
 $node->stop;
-- 
2.49.0

v20250328-0006-Stress-test-verify_gin-using-pgbench.patch (text/x-patch)
From 5fccd4981c6aac5249bceca4b3e698d53f9a4480 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Fri, 21 Feb 2025 12:11:07 -0800
Subject: [PATCH v20250328 6/6] Stress test verify_gin() using pgbench

Add a TAP test which inserts, updates, deletes, and checks in
parallel.  Like all pgbench-based TAP tests, this test contains race
conditions between the operations, so Your Mileage May Vary.  For
me, on my laptop, I got failures like:

	index "ginidx" has wrong tuple order on entry tree page

which I have not yet investigated.  The test is included here for
anybody interested in debugging this failure.
---
 contrib/amcheck/t/006_gin_concurrency.pl | 196 +++++++++++++++++++++++
 1 file changed, 196 insertions(+)
 create mode 100644 contrib/amcheck/t/006_gin_concurrency.pl

diff --git a/contrib/amcheck/t/006_gin_concurrency.pl b/contrib/amcheck/t/006_gin_concurrency.pl
new file mode 100644
index 00000000000..afc67940d4d
--- /dev/null
+++ b/contrib/amcheck/t/006_gin_concurrency.pl
@@ -0,0 +1,196 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init;
+$node->append_conf('postgresql.conf',
+	'lock_timeout = ' . (1000 * $PostgreSQL::Test::Utils::timeout_default));
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql('postgres', q(CREATE TABLE tbl(i integer[], j jsonb, k jsonb)));
+$node->safe_psql('postgres', q(CREATE INDEX ginidx ON tbl USING gin(i, j, k)));
+$node->safe_psql('postgres', q(CREATE TABLE jsondata (i serial, j jsonb)));
+$node->safe_psql('postgres', q(INSERT INTO jsondata (j) VALUES
+	('1'),
+	('91'),
+	('[5]'),
+	('true'),
+	('"zxI"'),
+	('[1, 7]'),
+	('["", 4]'),
+	('"utDFBz"'),
+	('[[9], ""]'),
+	('"eCvxKPML"'),
+	('["1VMQNQM"]'),
+	('{"": "562c"}'),
+	('[58, 8, null]'),
+	('{"": {"": 62}}'),
+	('["", 6, 19, ""]'),
+	('{"ddfWTQ": true}'),
+	('["", 734.2, 9, 5]'),
+	('"GMV27mjtuuqmlltw"'),
+	('{"dabe": -5, "": 6}'),
+	('"hgihykirQGIYTcCA30"'),
+	('[9, {"Utrn": -6}, ""]'),
+	('"BJTZUMST1_WWEgyqgka_"'),
+	('["", -4, "", [-2], -47]'),
+	('{"": [3], "": {"": "y"}}'),
+	('{"myuijj": "YUWIUZXXLGS"}'),
+	('{"3": false, "C": "1sHTX"}'),
+	('"ZGUORVDE_ACF1QXJ_hipgwrks"'),
+	('{"072": [3, -4], "oh": "eL"}'),
+	('[{"de": 9, "JWHPMRZJW": [0]}]'),
+	('"EACJUZEBAFFBEE6706SZLWVGO635"'),
+	('["P", {"TZW": [""]}, {"": [0]}]'),
+	('{"": -6, "YMb": -22, "__": [""]}'),
+	('{"659": [8], "bfc": [0], "V": ""}'),
+	('{"8776": "1tryl", "Q": 2, "": 4.6}'),
+	('[[1], "", 9, 0, [1, 0], -1, 0, "C"]'),
+	('"635321pnpjlfFzhGTIYP9265iA_19D8260"'),
+	('"klmxsoCFDtzxrhotsqlnmvmzlcbdde34twj"'),
+	('"GZSXSZVS19ecbe_ZJJED0379c1j9_GSU9167"'),
+	('{"F18s": {"": -84194}, "ececab2": [""]}'),
+	('["", {"SVAvgg": "Q"}, 1, 9, "gypy", [1]]'),
+	('[[""], {"": 5}, "GVZGGVGSWM", 2, ["", 8]]'),
+	('{"V": 8, "TPNL": [826, null], "4": -9.729}'),
+	('{"HTJP_DAptxn6": 9, "": "r", "hji4124": ""}'),
+	('[1, ["9", 5, 6, ""], {"": "", "": "efb"}, 7]'),
+	('{"": 6, "1251e_cajrgkyzuxBEDM017444EFD": 548}'),
+	('{"853": -60, "TGLUG_jxmrggv": null, "pjx": ""}'),
+	('[0, "wsgnnvCfJVV_KOMLVXOUIS9FIQLPXXBbbaohjrpj"]'),
+	('"nizvkl36908OLW22ecbdeEBMHMiCEEACcikwkjpmu30X_m"'),
+	('{"bD24eeVZWY": 1, "Bt": 9, "": 6052, "FT": ["h"]}'),
+	('"CDBnouyzlAMSHJCtguxxizpzgkNYfaNLURVITNLYVPSNLYNy"'),
+	('{"d": [[4, "N"], null, 6, true], "1PKV": 6, "9": 6}'),
+	('[-7326, [83, 55], -63, [0, {"": 1}], {"ri0": false}]'),
+	('{"": 117.38, "FCkx3608szztpvjolomzvlyrshyvrgz": -4.2}'),
+	('["", 8, {"WXHNG": {"6": 4}}, [null], 7, 2, "", 299, 6]'),
+	('[[-992.2, "TPm", "", "cedeff79BD8", "t", [1]], 0, [-7]]'),
+	('[9, 34, ["LONuyiYGQZ"], [7, 88], ["c"], 1, 6, "", [[2]]]'),
+	('[20, 5, null, "eLHTXRWNV", 8, ["pnpvrum", -3], "FINY", 3]'),
+	('[{"": "", "b": 2, "d": "egu"}, "aPNK", 2, 9, {"": -79946}]'),
+	('[1, {"769": 9}, 5, 9821, 22, 0, 2.7, 5, 4, 191, 54.599, 24]'),
+	('["c", 77, "b_0lplvHJNLMxw", "VN76dhFadaafadfe5dfbco", false]'),
+	('"TYIHXebbPK_86QMP_199bEEIS__8205986vdC_CFAEFBFCEFCJQRHYoqztv"'),
+	('"cdmxxxzrhtxpwuyrxinmhb5577NSPHIHMTPQYTXSUVVGJPUUMCBEDb_1569e"'),
+	('[[5, null, "C"], "ORNR", "mnCb", 1, -800, "6953", ["K", 0], ""]'),
+	('"SSKLTHJxjxywwquhiwsde353eCIJJjkyvn9946c2cdVadcboiyZFAYMHJWGMMT"'),
+	('"5185__D5AtvhizvmEVceF3jxtghlCF0789_owmsztJHRMOJ7rlowxqq51XLXJbF"'),
+	('{"D": 565206, "xupqtmfedff": "ZGJN9", "9": 1, "glzv": -47, "": -8}'),
+	('{"": 9, "": {"": [null], "ROP": 842}, "": ["5FFD", 7, 5, 1, 94, 1]}'),
+	('{"JLn": ["8s"], "": "_ahxizrzhivyzvhr", "XSAt": 5, "P": 2838, "": 5}'),
+	('[51, 3, {"": 9, "": -9, "": [[6]]}, 7, 7, {"": 0}, "TXLQL", 7.6, [7]]'),
+	('[-38.7, "kre40", 5, {"": null}, "tvuv", 8, "", "", "uizygprwwvh", "1"]'),
+	('"z934377_nxmzjnuqglgyukjteefeihjyot1irkvwnnrqinptlpzwjgmkjbQMUVxxwvbdz"'),
+	('[165.9, "dAFD_60JQPYbafh", false, {"": 6, "": "fcfd"}, [[2], "c"], 4, 2]'),
+	('"ffHOOPVSSACDqiyeecTNWJMWPNRXU283aHRXNUNZZZQPUGYSQTTQXQVJM5eeafcIPGIHcac"'),
+	('[2, 8, -53, {"": 5}, "F9", 8, "SGUJPNVI", "7OLOZH", 9.84, {"": 6}, 207, 6]'),
+	('"xqmqmyljhq__ZGWJVNefagsxrsktruhmlinhxloupuVQW0804901NKGGMNNSYYXWQOosz8938"'),
+	('{"FEoLfaab1160167": {"L": [42, 0]}, "938": "FCCUPGYYYMQSQVZJKM", "knqmk": 2}'),
+	('"0igyurmOMSXIYHSZQEAcxlvgqdxkhwtrbaabfaaMC138Z_BDRLrythpi30_MPRXMTOILRLswmoy"'),
+	('"1129BBCABFFAACA9VGVKipnwohaccc9TSIMTOQKHmcGYVeFE_PWKLHmpyj60137672qugtsstugg"'),
+	('"D3BDA069074174vx48A37IVHWVXLUP9382542ypsl1465pixtryzCBgrkkhrvCC_BDDFatkyXHLIe"'),
+	('[{"esx7": -53, "ec60834YGVMYoXAAvgxmmqnojyzmiklhdovFipl": 2, "os": 66433}, 9.13]'),
+	('{"": ["", 4, null, 5, null], "": "3", "5_GMMHTIhPB_F_vsebc1": "Er", "GY": 121.32}'),
+	('["krTVPYDEd", 5, 8, [6, -6], [[-9], 3340, [[""]]], "", 5, [6, true], 3, "", 1, ""]'),
+	('{"rBNPKN8446080wruOLeceaCBDCKWNUYYMONSJUlCDFExr": {"": "EE0", "6826": 5, "": 7496}}'),
+	('[3, {"": -8}, "101dboMVSNKZLVPITLHLPorwwuxxjmjsh", "", "LSQPRVYKWVYK945imrh", 4, 51]'),
+	('[["HY6"], "", "bcdB", [2, [85, 1], 3, 3, 3, [8]], "", ["_m"], "2", -33, 8, 3, "_xwj"]'),
+	('["", 0, -3.7, 8, false, null, {"": 5}, 9, "06FccxFcdb283bbZGGVRSMWLJH2_PBAFpwtkbceto"]'),
+	('[52, "", -39, -7, [1], "c", {"": 9, "": 45528, "G": {"": 7}}, 3, false, 0, "EB", 8, -6]'),
+	('"qzrkvrlG78CCCEBCptzwwok808805243QXVSYed3efZSKLSNXPxhrS357KJMWSKgrfcFFDFDWKSXJJSIJ_yqJu"'),
+	('[43, 8, {"": ""}, "uwtv__HURKGJLGGPPW", 9, 66, "yqrvghxuw", {"J": false}, false, 2, 0, 4]'),
+	('[{"UVL": 7, "": 1}, false, [6, "H"], "boxlgqgm", 3, "znhm", [true], 0, ["e", 3.7], 9, 9.4]'),
+	('{"825634870117somzqw": 1, "": [5], "gYH": "_XT", "b22412631709RZP": 3, "": "", "FDB": [""]}'),
+	('[8, ["_bae"], "", "WN", 80, {"o": 2, "aff": 16}, false, true, 4, 6, {"nutzkzikolsxZRQ": 30}]'),
+	('["588BD9c_xzsn", {"k": 0, "_Ecezlkslrwvjpwrukiqzl": 3, "Ej": "4"}, "TUXwghn1dTNRXJZpswmD", 5]'),
+	('[{"dC": 7}, {"": 1, "4": 41, "": "", "": "adKS"}, {"": "ypv"}, 6, 9, 2, [-61.46], [1, 3.9], 2]'),
+	('{"8": 8, "": -364, "855": -238.1, "zj": 9, "SNHJG413": 3, "UMNVI73": [60, 0], "iwvqse": -1.833}'),
+	('"VTUKMLZKQPHIEniCFZ_cjrhvspxzulvxhqykjzmrw89OGOGISWdcrvpOPLOFALGK809896999xzqnkm63254_xrmcfcedb"'),
+	('["", "USNQbcexyFDCdBAFWJIphloxwytplyZZR008400FmoiYXVYOHVGV79795644463Aug_aeoDDEjzoziisxoykuijhz"]'),
+	('{"": 1, "5abB58gXVQVTTMWU3jSHXMMNV": "", "nv": 934, "kjsnhtj": 8, "": [{"xm": [71, 425]}], "": -9}'),
+	('"__oliqCcbwwyqmtECsqivplcb1NTMOQRZTYRJONOIPWNHKWLJRIHKROMJNZLNGTTKRcedebccdbMTQXSzhynxmllqxuhnxBA_"'),
+	('["thgACBWGNGMkFFEA", [0, -1349, {"18": "RM", "F3": 6, "dP": "_AF"}, 64, 0, {"f": [8]}], 5, [[0]], 2]')
+));
+
+#
+# Stress gin with pgbench.
+#
+# Modify the table data, and hence the index data, from multiple process
+# while from other processes run the index checking code.  This should,
+# if the index is large enough, result in the checks performing across
+# concurrent page splits.
+#
+$node->pgbench(
+	'--no-vacuum --client=20 --transactions=5000',
+	0,
+	[qr{actually processed}],
+	[qr{^$}],
+	'concurrent DML and index checking',
+	{
+		'006_gin_concurrency_insert_1' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT ARRAY[x.i, y.i, random(0,100000), random(0,100000)], x.j, y.j
+					FROM jsondata x, jsondata y
+					WHERE x.i = random(1,100)
+					  AND y.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_2' => q(
+			INSERT INTO tbl (i, j, k)
+				(SELECT gs.i, j.j, j.j || j.j
+					FROM jsondata j,
+						 (SELECT array_agg(gs) AS i FROM generate_series(random(0,100), random(101,200)) gs) gs
+					WHERE j.i = random(1,100)
+				)
+		  ),
+		'006_gin_concurrency_insert_nulls' => q(
+			INSERT INTO tbl (i, j, k) VALUES
+				(null,               null, null),
+				(null,               null, '[]'),
+				(null,               '[]', null),
+				(ARRAY[]::INTEGER[], null, null),
+				(null,               '[]', '[]'),
+				(ARRAY[]::INTEGER[], '[]', null),
+				(ARRAY[]::INTEGER[], '[]', '[]')
+		  ),
+		'006_gin_concurrency_update_i' => q(
+			UPDATE tbl
+				SET i = (SELECT i || i FROM tbl ORDER BY random() LIMIT 1)
+				WHERE j = (SELECT j FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_j' => q(
+			UPDATE tbl
+				SET j = (SELECT j || j FROM tbl ORDER BY random() LIMIT 1)
+				WHERE k = (SELECT k FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_update_k' => q(
+			UPDATE tbl
+				SET k = (SELECT k || k FROM tbl ORDER BY random() LIMIT 1)
+				WHERE i = (SELECT i FROM tbl ORDER BY random() LIMIT 1);
+		),
+		'006_gin_concurrency_delete' => q(
+			DELETE FROM tbl
+				WHERE random(1,5) = 3;
+		),
+		'006_gin_concurrency_gin_index_check' => q(
+				SELECT gin_index_check('ginidx');
+		)
+	});
+
+$node->stop;
+done_testing();
+
-- 
2.49.0

#73Kirill Reshke
reshkekirill@gmail.com
In reply to: Tomas Vondra (#72)
Re: Amcheck verification of GiST and GIN

On Fri, 28 Mar 2025 at 21:26, Tomas Vondra <tomas@vondra.me> wrote:

Here's a polished version of the patches. If you have any
comments/objections, please speak now.
--
Tomas Vondra

Hi, no objections, lgtm

--
Best regards,
Kirill Reshke

#74Tomas Vondra
tomas@vondra.me
In reply to: Kirill Reshke (#73)
Re: Amcheck verification of GiST and GIN

On 3/28/25 20:51, Kirill Reshke wrote:

On Fri, 28 Mar 2025 at 21:26, Tomas Vondra <tomas@vondra.me> wrote:

Here's a polished version of the patches. If you have any
comments/objections, please speak now.
--
Tomas Vondra

Hi, no objections, lgtm

I've pushed all the parts of this patch series, except for the stress
test - which I think was not meant for commit.

buildfarm seems happy so far, except for a minor indentation issue
(forgot to reindent after merging the separate fix patch).

Marked as committed in the CFA - that's not entirely correct, because
the original patch series also included amcheck support for GiST, but
that was not committed. I suggest we open a new CF entry if that gets
resubmitted for PG19 (I hope that will be the case).

Thanks for the patches, reviews, etc.!

--
Tomas Vondra

#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#74)
Re: Amcheck verification of GiST and GIN

Tomas Vondra <tomas@vondra.me> writes:

I've pushed all the parts of this patch series, except for the stress
test - which I think was not meant for commit.
buildfarm seems happy so far, except for a minor indentation issue
(forgot to reindent after merging the separate fix patch).

Not so much here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gecko&dt=2025-03-29%2020%3A25%3A23

Need to avoid depending on md5(), or it'll fail on machines with
FIPS mode enabled. See for example 95b856de2.
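
For instance, where a test builds pseudo-random strings with md5(), the usual
substitution is a FIPS-allowed digest such as sha256() (in core since v11).
A sketch of the idea only, not the actual committed fix:

SELECT encode(sha256(convert_to(i::text, 'UTF8')), 'hex') AS val
FROM generate_series(1, 1000) AS i;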

regards, tom lane

#76Tomas Vondra
tomas@vondra.me
In reply to: Tom Lane (#75)
Re: Amcheck verification of GiST and GIN

On 3/30/25 06:04, Tom Lane wrote:

Tomas Vondra <tomas@vondra.me> writes:

I've pushed all the parts of this patch series, except for the stress
test - which I think was not meant for commit.
buildfarm seems happy so far, except for a minor indentation issue
(forgot to reindent after merging the separate fix patch).

Not so much here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gecko&dt=2025-03-29%2020%3A25%3A23

Need to avoid depending on md5(), or it'll fail on machines with
FIPS mode enabled. See for example 95b856de2.

Ah, I forgot about that. Fixed.

regards

--
Tomas Vondra

#77Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Tomas Vondra (#76)
1 attachment(s)
Re: Amcheck verification of GiST and GIN

Hello,

Thanks everybody for the patch.

I noticed there are no tests checking that the GIN check fails if the index is
corrupted, so I thought it would be great to have some.
While writing tests I noticed some issues in the patch (all of them are
in verify_gin.c).

1) When we add new items to the entry tree stack, ptr->parenttup is always NULL,
because GinPageGetOpaque(page)->rightlink is never zero.

/* last tuple in layer has no high key */
if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
    ptr->parenttup = CopyIndexTuple(idxtuple);
else
    ptr->parenttup = NULL;

The right way to check that the page has no right neighbour is

GinPageGetOpaque(page)->rightlink == InvalidBlockNumber

But even with that fixed, the condition still would not do what the comment
says. If we want parenttup to be NULL
only for the last tuple of the tree level, the right check would be:

if (i == maxoff && rightlink == InvalidBlockNumber)
    ptr->parenttup = NULL;
else
    ptr->parenttup = CopyIndexTuple(idxtuple);

2) The check doesn't use attnum in comparisons, but for multicolumn indexes
attnum defines the ordering. When we compare the page's max entry key
with the parent key we ignore attnum, which means we can occasionally end up
comparing keys of different columns.
When checking the order within a page we currently skip the comparison for
tuples with different attnums,
but that can be checked too. The fix is easy: use ginCompareAttEntries()
instead of ginCompareEntries().
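
For illustration, a minimal sketch of the difference between the two comparators,
assuming the GinState, keys, categories and attnums have already been extracted
as in verify_gin.c (this just restates the signatures used in the attached patch):

int cmp;

/* ginCompareEntries(): attnum only selects the per-column comparator,
 * so both keys are assumed to belong to the same column */
cmp = ginCompareEntries(&state, attnum,
                        page_max_key, page_max_key_category,
                        parent_key, parent_key_category);

/* ginCompareAttEntries(): the attnums themselves are compared first,
 * which matches the (attnum, key) ordering of a multicolumn entry tree */
cmp = ginCompareAttEntries(&state, page_max_key_attnum,
                           page_max_key, page_max_key_category,
                           parent_key_attnum,
                           parent_key, parent_key_category);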

3) Here is the split-detection code:

if (rightlink != InvalidBlockNumber &&
    ginCompareEntries(&state, attnum, page_max_key,
                      page_max_key_category, parent_key,
                      parent_key_category) > 0)
{
    /* split page detected, install right link to the stack */

The condition doesn't seem right, because the child page's max item key can
never be bigger than the parent key.
It can be equal to the parent key, which means there was no split
and the parent key we cached in the stack is still
relevant. Or it can be less than the cached parent key, which means a
split took place and the old max item key moved to the
right neighbour, so the current page's max item key is now less than the
cached parent key. So I think we should replace > with <.

4) Here is the code for checking the order within the entry page.

/*
 * First block is metadata, skip order check. Also, never check
 * for high key on rightmost page, as this key is not really
 * stored explicitly.
 *
 * Also make sure to not compare entries for different attnums,
 * which may be stored on the same page.
 */
if (i != FirstOffsetNumber && attnum == prev_attnum &&
    stack->blkno != GIN_ROOT_BLKNO &&
    !(i == maxoff && rightlink == InvalidBlockNumber))
{
    prev_key = gintuple_get_key(&state, prev_tuple,
                                &prev_key_category);
    if (ginCompareEntries(&state, attnum, prev_key,
                          prev_key_category, current_key,
                          current_key_category) >= 0)

We skip the order check for the root page, and it's not clear why.
Probably the root page was mixed up with the metapage, because the
comment says "First block is metadata, skip order check". So I think
we can remove

stack->blkno != GIN_ROOT_BLKNO

5) The same place as 4). We skip checking the order for the high key
on the rightmost page, as this key is not really stored explicitly,
but for leaf pages all keys are stored explicitly, so we can check the
order for the last item of the leaf page too.
So I think we can change the condition to this:

!(i == maxoff && rightlink == InvalidBlockNumber &&
!GinPageIsLeaf(page))

6) In the posting-tree parent key check:

/*
 * Check if this tuple is consistent with the downlink in the
 * parent.
 */
if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
    ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
    ereport(ERROR,
            ...
Here we don't check whether stack->parentkey is valid, so sometimes we
compare an invalid parentkey (we can have a
valid parentblk and an invalid parentkey at the same time). An invalid
parentkey always compares as bigger, so the code never triggers the
ereport, but it doesn't look right. So we can probably rewrite it this way:

if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)

7) When producing stack entries for the posting tree check, we set the parent
key like this:

/*
 * Set rightmost parent key to invalid item pointer. Its value
 * is 'Infinity' and not explicitly stored.
 */
if (rightlink == InvalidBlockNumber)
    ItemPointerSetInvalid(&ptr->parentkey);
else
    ptr->parentkey = posting_item->key;

We set an invalid parent key for all items of the rightmost page. But
only the rightmost item lacks an explicit
parent key (the comment actually says exactly this, but the code does a
different thing). All the others have an explicit parent
key and we can set it. So the fix could look like this:

if (rightlink == InvalidBlockNumber && i == maxoff)
    ItemPointerSetInvalid(&ptr->parentkey);
else
    ptr->parentkey = posting_item->key;

But for (rightlink == InvalidBlockNumber && i == maxoff),
posting_item->key is always (0,0) (we check that a little earlier),
so I think we can simplify it to:

ptr->parentkey = posting_item->key;

8) In the function gin_refind_parent() the code

if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)

triggers Assert(ItemPointerIsValid(pointer)) inside
ItemPointerGetBlockNumber(), because itup->t_tid can be invalid.
As far as I can see, the GIN code uses a dedicated helper, GinGetDownlink(itup),
that avoids this Assert, so we can use it here too.

Please find attached a patch with fixes for the issues above. It also
adds a regression test for a multicolumn index
and several TAP tests covering some basic corrupted-index cases. I'm not
sure this is the right way to write such tests and would
be glad to hear any feedback, especially about
invalid_entry_columns_order_test(), where it seems important to
preserve byte ordering. Also, all tests currently assume the standard
8192-byte page size.

There are also several points that I think are worth addressing:

9) The 'parentlsn' field is set but never actually used in any check, unless
I missed something.

10) The README says "Vacuum never deletes tuples or pages from the entry
tree", but the check assumes it's possible to have a
deleted leaf page with zero entries:

if (GinPageIsDeleted(page))
{
    if (!GinPageIsLeaf(page))
        ereport(ERROR,
                (errcode(ERRCODE_INDEX_CORRUPTED),
                 errmsg("index \"%s\" has deleted internal page %u",
                        RelationGetRelationName(rel), blockNo)));
    if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
        ereport(ERROR,
                (errcode(ERRCODE_INDEX_CORRUPTED),
                 errmsg("index \"%s\" has deleted page %u with tuples",
                        RelationGetRelationName(rel), blockNo)));
}
11) When we compare the entry tree page's max key with the parent key:

if (ginCompareAttEntries(&state, attnum, current_key,
                         current_key_category, parent_key_attnum,
                         parent_key, parent_key_category) > 0)
{
    /*
     * There was a discrepancy between parent and child
     * tuples. We need to verify it is not a result of
     * concurrent call of gistplacetopage(). So, lock parent
     * and try to find downlink for current page. It may be
     * missing due to concurrent page split, this is OK.
     */
    pfree(stack->parenttup);
    stack->parenttup = gin_refind_parent(rel, stack->parentblk,
                                         stack->blkno, strategy);

I think we can remove gin_refind_parent() and do the ereport right away here,
by the same logic as in 3). AFAIK it's impossible to have a child item
with a key that is higher than the cached parent key:
the parent key bounds what keys can be inserted into the child page, so
there seems to be no way such a key can appear there.

Best regards,
Arseniy Mukhin

Attachments:

0001-verify-gin-fixes-and-tests.patchtext/x-patch; charset=US-ASCII; name=0001-verify-gin-fixes-and-tests.patchDownload
From 88b703737f112c177a3c1ed877bbb0a1374bb8ca Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Thu, 8 May 2025 19:54:38 +0300
Subject: [PATCH] verify gin fixes and tests

---
 contrib/amcheck/expected/check_gin.out |  12 ++
 contrib/amcheck/meson.build            |   1 +
 contrib/amcheck/sql/check_gin.sql      |  10 +
 contrib/amcheck/t/006_verify_gin.pl    | 274 +++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c           |  57 +++--
 5 files changed, 324 insertions(+), 30 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..3b6077c6282
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,274 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+        CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+        SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+        $$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+inconsistent_with_parent_key_parent_key_corrupted_test();
+inconsistent_with_parent_key_child_key_corrupted_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		"aaaaa",
+		"ccccc",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		"rrrrrrrrrr",
+		"zzzzzzzzzz",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[],b text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $find = pack('s', 1) . pack('c', 0x09) . "aaa";
+	my $replace = pack('s', 2) . pack('c', 0x09) . "aaa";
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = pack('s', 2) . pack('c', 0x09) . "bbb";
+	$replace = pack('s', 1) . pack('c', 0x09) . "bbb";
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub inconsistent_with_parent_key_parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		"aaaaaaaaaa",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has inconsistent records on page 5 offset 3');
+}
+
+sub inconsistent_with_parent_key_child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		"pppppppppp",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has inconsistent records on page 5 offset 3');
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block {
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/$replace/g;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
@@ -463,17 +459,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +525,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +551,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +577,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +586,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
@@ -608,10 +605,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -749,7 +746,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.43.0

#78Tomas Vondra
tomas@vondra.me
In reply to: Arseniy Mukhin (#77)
Re: Amcheck verification of GiST and GIN

On 5/9/25 14:43, Arseniy Mukhin wrote:

Hello,

Thanks everybody for the patch.

I noticed there are no tests checking that the GIN check fails if the index is
corrupted, so I thought it would be great to have some.
While writing tests I noticed some issues in the patch (all of them are
in verify_gin.c).

[...]

These look like good points. I've added this to the open items so that we
don't forget about it; I won't have time to look at it until after
pgconf.dev.

thanks

--
Tomas Vondra

#79Tomas Vondra
tomas@vondra.me
In reply to: Arseniy Mukhin (#77)
2 attachment(s)
Re: Amcheck verification of GiST and GIN

Hello Arseniy,

I finally got time to look at this more closely, and do some testing.

Are there any cases where the current code incorrectly reports corruption
for a valid index? So far I've been unable to find such a case. Or am I wrong?

It seems to me all the proposed changes are "tightening" the checks, in
the sense that we might have missed certain types of issues before. This
is supported by the fact that the new TAP test fails on master, i.e.
master does not report the corruption the TAP test introduces.

(The TAP test is great; it would have been good to add something like
this in the original commit.)

Also, I've noticed that the TAP test passes even with some (most) of the
verify_gin.c changes reverted. See the 0002 patch - this does not break
the TAP test. Of course, that does not prove the changes are wrong and
I'm not claiming that. But can we improve the TAP test to trigger this
too? To show the current code (in master) misses this?

Grigory, Andrey, Heikki, any opinions on the tweaks?

regards

--
Tomas Vondra

Attachments:

v2-0001-verify-gin-fixes-and-tests.patchtext/x-patch; charset=UTF-8; name=v2-0001-verify-gin-fixes-and-tests.patchDownload
From 973de3eaeeca7ff2946a5b0f92f481d70ba5b78d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 26 May 2025 12:10:37 +0200
Subject: [PATCH v2 1/2] verify gin fixes and tests

---
 contrib/amcheck/expected/check_gin.out |  12 ++
 contrib/amcheck/meson.build            |   1 +
 contrib/amcheck/sql/check_gin.sql      |  10 +
 contrib/amcheck/t/006_verify_gin.pl    | 274 +++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c           |  57 +++--
 5 files changed, 324 insertions(+), 30 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..3b6077c6282
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,274 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+        CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+        SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+        $$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+inconsistent_with_parent_key_parent_key_corrupted_test();
+inconsistent_with_parent_key_child_key_corrupted_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		"aaaaa",
+		"ccccc",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		"rrrrrrrrrr",
+		"zzzzzzzzzz",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[],b text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $find = pack('s', 1) . pack('c', 0x09) . "aaa";
+	my $replace = pack('s', 2) . pack('c', 0x09) . "aaa";
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = pack('s', 2) . pack('c', 0x09) . "bbb";
+	$replace = pack('s', 1) . pack('c', 0x09) . "bbb";
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295');
+}
+
+sub inconsistent_with_parent_key_parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		"aaaaaaaaaa",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has inconsistent records on page 5 offset 3');
+}
+
+sub inconsistent_with_parent_key_child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blksize = 8192;
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		"pppppppppp",
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ 'index "test_gin_idx" has inconsistent records on page 5 offset 3');
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block {
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/$replace/g;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
@@ -463,17 +459,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +525,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +551,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +577,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +586,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
@@ -608,10 +605,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -749,7 +746,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.49.0

v2-0002-undo.patchtext/x-patch; charset=UTF-8; name=v2-0002-undo.patchDownload
From 1495281bb00810ed3c5c3ea20bbb3b626c73b580 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 26 May 2025 12:24:21 +0200
Subject: [PATCH v2 2/2] undo

---
 contrib/amcheck/verify_gin.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,10 +359,14 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * The rightmost parent key is always invalid item pointer.
-				 * Its value is 'Infinity' and not explicitly stored.
+				 * Set rightmost parent key to invalid item pointer. Its value
+				 * is 'Infinity' and not explicitly stored.
 				 */
-				ptr->parentkey = posting_item->key;
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
+					ptr->parentkey = posting_item->key;
+
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.49.0

#80Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Tomas Vondra (#79)
Re: Amcheck verification of GiST and GIN

Hello Tomas,

On Mon, May 26, 2025 at 1:27 PM Tomas Vondra <tomas@vondra.me> wrote:

Hello Arseniy,

I finally got time to look at this more closely, and do some testing.

Thank you for looking into this.

Are there any cases where the current code incorrectly reports corruption
for a valid index? So far I've been unable to find such a case. Or am I wrong?

I think you are right; I'm not aware of such cases either.

It seems to me all the proposed changes are "tightening" the checks, in
the sense that we might have missed certain types of issues before. This
is supported by the fact that the new TAP test fails on master, i.e.
master does not report the corruption the TAP introduces.

I would say points 4, 5, 7 - yes, they are about tightening checks.

I think point 1 is more about fixing the existing code. In the current code,
parent_key is always NULL for the entry tree, so a bunch of code
(related to checking consistency between parents and children) is unreachable.

Then, if you apply the changes from the 1st point and the parent_key comparison
code starts working, you will also need the changes from the 2nd point. The
current code ignores attribute numbers in the parent_key check, which can lead
to comparing keys of different columns. I see one scenario where it can happen:
say we have a two-column index, where the first attribute type is "int" and the
second is "text". In a multicolumn GIN index a tuple has two parts, the
attribute number and the key value; let's write it as (attno, key). While
traversing the entry tree, the current code caches parent keys together with
the child blkno. Say it cached the parent key (2, "a"). That means there was a
time when the child page's high key was (2, "a"). But by the time the child
page check actually starts, it's possible that, as a result of concurrent
splits, the child page contains only keys of the first attribute, for example
(1, 1), (1, 5), (1, 10). So if we ignore the attribute number here, we end up
comparing 10 with "a". I hope the example is not too confusing.
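
As a minimal sketch (not part of any attached patch; the table and index names
are made up, and it assumes the gin_index_check() function from this patch
series is installed), this is the kind of two-column setup where entries for
both attnos land in the same entry tree:

use PostgreSQL::Test::Cluster;
use PostgreSQL::Test::Utils;

my $node = PostgreSQL::Test::Cluster->new('multicol_demo');
$node->init;
$node->start;
$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));

# Entries for attno 1 (int keys) and attno 2 (text keys) share the same
# entry tree, so a check that ignores attno could end up comparing an int
# key against a text key.
$node->safe_psql('postgres', q(
	CREATE TABLE multicol_demo (a int[], b text[]);
	INSERT INTO multicol_demo
		SELECT ARRAY[i], ARRAY['a' || i] FROM generate_series(1, 10000) i;
	CREATE INDEX multicol_demo_idx ON multicol_demo USING gin (a, b);
));

$node->safe_psql('postgres', q(SELECT gin_index_check('multicol_demo_idx')));
$node->stop;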

The 3rd point is about code that never runs. As I understand it, that check is
supposed to detect splits so that we can check more index pages, but if I'm not
mistaken it doesn't work right now.

The 6th point is about the comparison with an invalid item pointer. I thought
it was probably not right to compare against an invalid pointer, but now I'm
not sure.

(The TAP test is great; it would have been good to add something like
this in the original commit.)

Great, thank you for the feedback.

Also, I've noticed that the TAP test passes even with some (most) of the
verify_gin.c changes reverted. See the 0002 patch - this does not break
the TAP test. Of course, that does not prove the changes are wrong and
I'm not claiming that. But can we improve the TAP test to trigger this
too? To show the current code (in master) misses this?

Yes, the changes in the undo patch are about the posting tree check part
(points 6 and 7), and I haven't written tests for it, because breaking the
posting tree requires manipulating tids, which is not as easy as replacing
"aaaa" with "cccc" the way the entry tree tests do. It would probably be much
easier to use the page API to corrupt some posting tree pages, but I don't know
whether that's even possible in TAP tests.
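
To make the difficulty concrete: corrupting a posting tree tid from a TAP test
means patching raw packed bytes rather than a readable string. A rough sketch,
reusing the string_replace_block() and relation_filepath() helpers from the
attached test; the block number and the packed tid words are invented values
that would have to be read from the real page first (e.g. with pageinspect):

# Assumes $node, $blksize and the helpers from the attached TAP test are in scope.
my $relpath = relation_filepath('test_gin_idx');
$node->stop;

# Invented values: the two 16-bit on-disk words of the high-key tid to shrink.
my $find    = pack('S', 65) . pack('S', 52);
my $replace = pack('S', 64) . pack('S', 52);

string_replace_block($relpath, $find, $replace, $blksize, 2);   # block 2: assumed posting tree root

$node->start;
my ($ret, $out, $err) = $node->psql('postgres', q(SELECT gin_index_check('test_gin_idx')));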

#81Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Arseniy Mukhin (#80)
2 attachment(s)
Re: Amcheck verification of GiST and GIN

On Mon, May 26, 2025 at 7:28 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

On Mon, May 26, 2025 at 1:27 PM Tomas Vondra <tomas@vondra.me> wrote:

Also, I've noticed that the TAP test passes even with some (most) of the
verify_gin.c changes reverted. See the 0002 patch - this does not break
the TAP test. Of course, that does not prove the changes are wrong and
I'm not claiming that. But can we improve the TAP test to trigger this
too? To show the current code (in master) misses this?

Yes, the changes in the undo patch are about the posting tree check part
(points 6 and 7), and I haven't written tests for it, because breaking the
posting tree requires manipulating tids, which is not as easy as replacing
"aaaa" with "cccc" the way the entry tree tests do. It would probably be much
easier to use the page API to corrupt some posting tree pages, but I don't know
whether that's even possible in TAP tests.

I added a test for the posting tree parent_key check. Now applying the 'undo'
patch results in a test failure.
Also, I realized that the test 'invalid_entry_columns_order_test' would fail on
big-endian machines, because the varlena length encoding differs between
little-endian and big-endian, so I changed the test a bit: it no longer uses
the varlena length byte in the regex.
I also removed the hardcoded blksize and now read it from the cluster
configuration. Some tests will still fail with a non-standard block size,
though (probably all tests where tree growth is expected).
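
Concretely, the two bits the updated test relies on look like this (mirroring
the attached v3 patch; the '(.)' wildcard is what skips the platform-dependent
varlena length byte):

# Block size taken from the cluster instead of hardcoding 8192.
my $blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));

# Match "<attno><one varlena length byte><key bytes>" without spelling out the
# length byte, so the pattern works on both little- and big-endian machines.
my $attrno_1 = pack('s', 1);
my $find     = qr/($attrno_1)(.)(aaa)/s;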

Best regards,
Arseniy Mukhin

Attachments:

v3-0002-undo.patchtext/x-patch; charset=US-ASCII; name=v3-0002-undo.patchDownload
From 2aeb21a7fc029d823e55f861344ee613b295e912 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Thu, 29 May 2025 14:33:37 +0300
Subject: [PATCH v3 2/2] undo

---
 contrib/amcheck/verify_gin.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,10 +359,14 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * The rightmost parent key is always invalid item pointer.
-				 * Its value is 'Infinity' and not explicitly stored.
+				 * Set rightmost parent key to invalid item pointer. Its value
+				 * is 'Infinity' and not explicitly stored.
 				 */
-				ptr->parentkey = posting_item->key;
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
+					ptr->parentkey = posting_item->key;
+
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.43.0

v3-0001-verify-gin-fixes-and-tests.patchtext/x-patch; charset=US-ASCII; name=v3-0001-verify-gin-fixes-and-tests.patchDownload
From 27d54e5bc4cf37b429543969c881a5d1d7fabc26 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Thu, 29 May 2025 13:47:47 +0300
Subject: [PATCH v3 1/2] verify-gin-fixes-and-tests

---
 contrib/amcheck/expected/check_gin.out |  12 +
 contrib/amcheck/meson.build            |   1 +
 contrib/amcheck/sql/check_gin.sql      |  10 +
 contrib/amcheck/t/006_verify_gin.pl    | 313 +++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c           |  57 +++--
 5 files changed, 363 insertions(+), 30 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..5d974228644
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,313 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+        CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+        SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+        $$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		"aaaaa",
+		'"ccccc"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		"rrrrrrrrrr",
+		'"zzzzzzzzzz"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[],b text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = '"' . $attrno_2 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = '"' . $attrno_1 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"aaaaaaaaaa"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"pppppppppp"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. for 4th page high key is (65,52), so let's make it a little bit
+	# smaller, so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S', 65) . pack('S', 52);
+	my $replace = '"' . pack('S', 64) . pack('S', 52) . '"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4");
+}
+
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block {
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/$replace/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
@@ -463,17 +459,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +525,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +551,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +577,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +586,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
@@ -608,10 +605,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -749,7 +746,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.43.0

#82Tomas Vondra
tomas@vondra.me
In reply to: Arseniy Mukhin (#81)
3 attachment(s)
Re: Amcheck verification of GiST and GIN

On 5/29/25 13:53, Arseniy Mukhin wrote:

On Mon, May 26, 2025 at 7:28 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

On Mon, May 26, 2025 at 1:27 PM Tomas Vondra <tomas@vondra.me> wrote:

Also, I've noticed that the TAP test passes even with some (most) of the
verify_gin.c changes reverted. See the 0002 patch - this does not break
the TAP test. Of course, that does not prove the changes are wrong and
I'm not claiming that. But can we improve the TAP test to trigger this
too? To show the current code (in master) misses this?

Yes, the changes in the undo patch are about the posting tree check part
(points 6 and 7), and I haven't written tests for it, because breaking the
posting tree requires manipulating tids, which is not as easy as replacing
"aaaa" with "cccc" the way the entry tree tests do. It would probably be much
easier to use the page API to corrupt some posting tree pages, but I don't know
whether that's even possible in TAP tests.

I added a test for the posting tree parent_key check. Now applying the 'undo'
patch results in a test failure.

Great, thank you.

I noticed git-am complaining about a couple whitespace issues in the
test, mostly about mixing spaces/tabs. The v4 fixes them (in a separate
part, but should be merged into 0001). It's a detail, but might be good
to try git-am on patches ;-)

Also, I realized that the test 'invalid_entry_columns_order_test' would fail on
big-endian machines, because the varlena length encoding differs between
little-endian and big-endian, so I changed the test a bit: it no longer uses
the varlena length byte in the regex.

I think it'd make sense to split this into smaller patches, each fixing
a different issue. Not one patch for each of the 11 items in your
original message, though; that would be overkill ...

I propose to split it like this, into three parts, each addressing a
particular type of mistake:

1) gin_check_posting_tree_parent_keys_consistency

2) gin_check_parent_keys_consistency / att comparisons

3) gin_check_parent_keys_consistency / setting ptr->parenttup (at the end)

Does this make sense to you? If yes, can you split the patch series like
this, including a commit message for each part, explaining the fix? We'd
need the commit message even with a single patch, ofc.

I also removed the hardcoded blksize and now read it from the cluster
configuration. Some tests will still fail with a non-standard block size,
though (probably all tests where tree growth is expected).

I think that's fine. AFAIK we don't expect tests to be 100% stable with
other block sizes. It shouldn't crash / segfault, ofc, but some tests
may be sensitive to this.

BTW I hoped to get this fix pushed this week, but that didn't happen and
I'll be away most of next week :-( Let's try to get this sorted so that
I can push it on June 16 or so.

regards

--
Tomas Vondra

Attachments:

v4-0001-verify-gin-fixes-and-tests.patchtext/x-patch; charset=UTF-8; name=v4-0001-verify-gin-fixes-and-tests.patchDownload
From f24396e4cbfa5d5b64cb815654a50bbd9d003154 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Thu, 29 May 2025 13:47:47 +0300
Subject: [PATCH v4 1/3] verify-gin-fixes-and-tests

---
 contrib/amcheck/expected/check_gin.out |  12 +
 contrib/amcheck/meson.build            |   1 +
 contrib/amcheck/sql/check_gin.sql      |  10 +
 contrib/amcheck/t/006_verify_gin.pl    | 313 +++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c           |  57 +++--
 5 files changed, 363 insertions(+), 30 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..5d974228644
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,313 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+        CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+        SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+        $$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		"aaaaa",
+		'"ccccc"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		"rrrrrrrrrr",
+		'"zzzzzzzzzz"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[],b text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = '"' . $attrno_2 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = '"' . $attrno_1 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"aaaaaaaaaa"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+	 	CREATE TABLE $relname (a text[]);
+	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"pppppppppp"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. for 4th page high key is (65,52), so let's make it a little bit
+	# smaller, so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S', 65) . pack('S', 52);
+	my $replace = '"' . pack('S', 64) . pack('S', 52) . '"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4");
+}
+
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block {
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/$replace/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
\ No newline at end of file
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
@@ -463,17 +459,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +525,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +551,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +577,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +586,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
@@ -608,10 +605,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -749,7 +746,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.49.0

v4-0002-whitespace-fixes.patchtext/x-patch; charset=UTF-8; name=v4-0002-whitespace-fixes.patchDownload
From accb47ff16f06de2dc45524a1cf1223cefb93272 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Sun, 8 Jun 2025 22:29:31 +0200
Subject: [PATCH v4 2/3] whitespace fixes

---
 contrib/amcheck/t/006_verify_gin.pl | 69 +++++++++++++++--------------
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index 5d974228644..e8afeb038fc 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -23,9 +23,9 @@ $blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
 $node->safe_psql(
 	'postgres', q(
-        CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
-        SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
-        $$ LANGUAGE SQL;));
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
 
 # Tests
 invalid_entry_order_leaf_page_test();
@@ -43,8 +43,8 @@ sub invalid_entry_order_leaf_page_test
 	$node->safe_psql(
 		'postgres', qq(
 		DROP TABLE IF EXISTS $relname;
-	 	CREATE TABLE $relname (a text[]);
-	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
 		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
 		SELECT gin_clean_pending_list('$indexname');
 	 ));
@@ -77,18 +77,18 @@ sub invalid_entry_order_inner_page_test
 	$node->safe_psql(
 		'postgres', qq(
 		DROP TABLE IF EXISTS $relname;
-	 	CREATE TABLE $relname (a text[]);
-	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
 		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
 		SELECT gin_clean_pending_list('$indexname');
-	 ));
+	));
 	my $relpath = relation_filepath($indexname);
 
 	$node->stop;
@@ -118,11 +118,11 @@ sub invalid_entry_columns_order_test
 	$node->safe_psql(
 		'postgres', qq(
 		DROP TABLE IF EXISTS $relname;
-	 	CREATE TABLE $relname (a text[],b text[]);
-	 	CREATE INDEX $indexname ON $relname USING gin (a,b);
+		CREATE TABLE $relname (a text[],b text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
 		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
 		SELECT gin_clean_pending_list('$indexname');
-	 ));
+	));
 	my $relpath = relation_filepath($indexname);
 
 	$node->stop;
@@ -169,15 +169,15 @@ sub inconsistent_with_parent_key__parent_key_corrupted_test
 	$node->safe_psql(
 		'postgres', qq(
 		DROP TABLE IF EXISTS $relname;
-	 	CREATE TABLE $relname (a text[]);
-	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
 		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
 		SELECT gin_clean_pending_list('$indexname');
-	 ));
+	));
 	my $relpath = relation_filepath($indexname);
 
 	$node->stop;
@@ -207,13 +207,13 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	$node->safe_psql(
 		'postgres', qq(
 		DROP TABLE IF EXISTS $relname;
-	 	CREATE TABLE $relname (a text[]);
-	 	CREATE INDEX $indexname ON $relname USING gin (a);
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
 		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
-        INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
 		SELECT gin_clean_pending_list('$indexname');
 	 ));
 	my $relpath = relation_filepath($indexname);
@@ -248,7 +248,7 @@ sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
 		CREATE TABLE $relname (a text[]);
 		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
 		CREATE INDEX $indexname ON $relname USING gin (a);
-	 ));
+	));
 	my $relpath = relation_filepath($indexname);
 
 	$node->stop;
@@ -287,7 +287,8 @@ sub relation_filepath
 	return "$pgdata/$rel";
 }
 
-sub string_replace_block {
+sub string_replace_block
+{
 	my ($filename, $find, $replace, $blksize, $blkno) = @_;
 
 	my $fh;
@@ -310,4 +311,4 @@ sub string_replace_block {
 	return;
 }
 
-done_testing();
\ No newline at end of file
+done_testing();
-- 
2.49.0

v4-0003-undo.patchtext/x-patch; charset=UTF-8; name=v4-0003-undo.patchDownload
From e7d488efd2b93ad56573fa597f2134df6e48b0ea Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Thu, 29 May 2025 14:33:37 +0300
Subject: [PATCH v4 3/3] undo

---
 contrib/amcheck/verify_gin.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
+				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,10 +359,14 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * The rightmost parent key is always invalid item pointer.
-				 * Its value is 'Infinity' and not explicitly stored.
+				 * Set rightmost parent key to invalid item pointer. Its value
+				 * is 'Infinity' and not explicitly stored.
 				 */
-				ptr->parentkey = posting_item->key;
+				if (rightlink == InvalidBlockNumber)
+					ItemPointerSetInvalid(&ptr->parentkey);
+				else
+					ptr->parentkey = posting_item->key;
+
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.49.0

#83Tomas Vondra
tomas@vondra.me
In reply to: Tomas Vondra (#82)
4 attachment(s)
Re: Amcheck verification of GiST and GIN

On 6/9/25 00:14, Tomas Vondra wrote:

...

I propose to split it like this, into three parts, each addressing a
particular type of mistake:

1) gin_check_posting_tree_parent_keys_consistency

2) gin_check_parent_keys_consistency / att comparisons

3) gin_check_parent_keys_consistency / setting ptr->parenttup (at the end)

Does this make sense to you? If yes, can you split the patch series like
this, including a commit message for each part, explaining the fix? We'd
need the commit message even with a single patch, ofc.

The attached v5 patch splits it along these lines, except that the extra
0001 part merely adds a multicolumn index into the regression test. The
0002-0004 parts are ordered to match the TAP test, i.e. it adds tests.

I've copied the points from the report to the commit messages, but this
needs cleanup/rephrasing, to make it readable. Could you look into
that? Of course, if you think the patches should be split differently,
feel free to move stuff.

And as I said before - if you feel the issues are too intertwined and
can't be split like this (or it just doesn't make sense), please speak
up. We can commit that as a single patch. It still needs the commit
message, though.

regards

--
Tomas Vondra

Attachments:

v5-0003-patch-2-gin_check_parent_keys_consistency.patchtext/x-patch; charset=UTF-8; name=v5-0003-patch-2-gin_check_parent_keys_consistency.patchDownload
From f967749e3021e2c3b5c2680275f35aa6d428abf9 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:45:56 +0200
Subject: [PATCH v5 3/4] patch 2: gin_check_parent_keys_consistency

1) When we add new items to the entry tree stack, ptr->parenttup is always null
because GinPageGetOpaque(page)->rightlink is never NULL.

             /* last tuple in layer has no high key */
                if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
                    ptr->parenttup = CopyIndexTuple(idxtuple);
                else
                    ptr->parenttup = NULL;

Right way to check if entry doesn't have right neighbour is

            GinPageGetOpaque(page)->rightlink == InvalidBlockNumber

But even if we fix it, the condition would not do what the comment
says. If we want to have NULL as parenttup
only for the last tuple of the btree layer, the right check would be:

                if (i == maxoff && rightlink == InvalidBlockNumber)
                    ptr->parenttup = NULL;
             else
                    ptr->parenttup = CopyIndexTuple(idxtuple);

8) In the function gin_refind_parent() the code

                if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)

triggers Assert(ItemPointerIsValid(pointer)) within the
ItemPointerGetBlockNumber(), because itup->t_tid could be invalid.
AFAIS GIN code uses a special method GinGetDownlink(itup) that avoids
this Assert. So we can use it here too.

Please find the attached patch with fixes for issues above. Also there
are added regression test for multicolumn index,
and several tap tests with some basic corrupted index cases. I'm not
sure if it's the right way to write such tests and would
be glad to hear any feedback, especially about
invalid_entry_columns_order_test() where it seems important to
preserve
byte ordering. Also all tests expect standard page size 8192 now.
---
 contrib/amcheck/t/006_verify_gin.pl | 78 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  8 +--
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index 8c5975d2e37..a999a13d183 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -31,6 +31,8 @@ $node->safe_psql(
 invalid_entry_order_leaf_page_test();
 invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -158,6 +160,82 @@ sub invalid_entry_columns_order_test
 	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"aaaaaaaaaa"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		"nnnnnnnnnn",
+		'"pppppppppp"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
+}
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 26b98571b56..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -609,10 +609,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -750,7 +750,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.49.0

v5-0004-patch-3-gin_check_posting_tree_parent_keys_consis.patchtext/x-patch; charset=UTF-8; name=v5-0004-patch-3-gin_check_posting_tree_parent_keys_consis.patchDownload
From 3f19acdaaeaa7db51ef66adf58befc5598339683 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:46:13 +0200
Subject: [PATCH v5 4/4] patch 3:
 gin_check_posting_tree_parent_keys_consistency

6) In posting tree parent key check part:

              /*
              * Check if this tuple is consistent with the downlink in the
              * parent.
              */
             if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
                ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
                ereport(ERROR,
                      ...
Here we don't check if stack->parentkey is valid, so sometimes we
compare invalid parentkey (because we can have
valid parentblk and invalid parentkey the same time). Invalid
parentkey is always bigger, so the code never triggers
ereport, but it doesn't look right. so probably we can rewrite it this way:

                if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
                ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)

7) When producing stack entries for posting tree check, we set parent
key like this:

             /*
              * Set rightmost parent key to invalid item pointer. Its value
              * is 'Infinity' and not explicitly stored.
              */
             if (rightlink == InvalidBlockNumber)
                ItemPointerSetInvalid(&ptr->parentkey);
             else
                ptr->parentkey = posting_item->key;

We set invalid parent key for all items of the rightmost page. But
it's the only rightmost item that doesn't have an explicit
parentkey (actually the comment says exactly this, but the code does a
different thing). All others have an explicit parent
key and we can set it. So fix can look like this:

                if (rightlink == InvalidBlockNumber && i == maxoff)
                ItemPointerSetInvalid(&ptr->parentkey);
             else
                ptr->parentkey = posting_item->key;

But for (rightlink == InvalidBlockNumber && i == maxoff)
posting_item->key is always (0,0) (we check it a little bit earlier),
so I think we can simplify it:

                ptr->parentkey = posting_item->key;
---
 contrib/amcheck/t/006_verify_gin.pl | 39 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        | 12 +++------
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index a999a13d183..e8afeb038fc 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -33,6 +33,7 @@ invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
 inconsistent_with_parent_key__parent_key_corrupted_test();
 inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -236,6 +237,44 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	ok($stderr =~ "index \"$indexname\" has inconsistent records on page 5 offset 3");
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. for 4th page high key is (65,52), so let's make it a little bit
+	# smaller, so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S', 65) . pack('S', 52);
+	my $replace = '"' . pack('S', 64) . pack('S', 52) . '"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4");
+}
+
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 8f6a5410cb7..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.49.0

v5-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchtext/x-patch; charset=UTF-8; name=v5-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchDownload
From e90dd32a4260c9ecb7aa5584de5d7f393ac2729b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:42:52 +0200
Subject: [PATCH v5 1/4] amcheck: Add gin_index_check on a multicolumn index

Extend the amcheck regression tests with a gin_index_check() check on
a small multicolumn index.

TODO briefly explain this increases test coverage

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/expected/check_gin.out | 12 ++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 10 ++++++++++
 2 files changed, 22 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
-- 
2.49.0

v5-0002-patch-1-gin_check_parent_keys_consistency.patchtext/x-patch; charset=UTF-8; name=v5-0002-patch-1-gin_check_parent_keys_consistency.patchDownload
From 7e6913007e00d8f8a6f01f87972492e3c2a925a3 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:44:59 +0200
Subject: [PATCH v5 2/4] patch 1: gin_check_parent_keys_consistency

2) Check don't use attnum in comparisons, but for multicolumn indexes
attnum defines order. When we compare max entry page key
with parent key we ignore attnum. It means we occasionally can try to
compare keys of different columns.
While checking order within the same page we skip checking order for
tuples with different attnums now,
but it can be checked. Fix is easy: using ginCompareAttEntries()
instead of ginCompareEntries().

3) Here a code of the split detection

       if (rightlink != InvalidBlockNumber &&
             ginCompareEntries(&state, attnum, page_max_key,
                           page_max_key_category, parent_key,
                           parent_key_category) > 0)
          {
             /* split page detected, install right link to the stack */

Condition seems not right, because the child page max item key never
can be bigger then parent key.
It can be equal to the parentkey, and it means that there was no split
and the parent key that we cached in the stack is still
relevant. Or it could be less then cached parent key and it means that
split took place and old max item key moved to the
right neighbour and current page max item key should be less then
cached parent key. So I think we should replace > with <.

4) Here is the code for checking the order within the entry page.

           /*
           * First block is metadata, skip order check. Also, never check
           * for high key on rightmost page, as this key is not really
           * stored explicitly.
           *
           * Also make sure to not compare entries for different attnums,
           * which may be stored on the same page.
           */
          if (i != FirstOffsetNumber && attnum == prev_attnum &&
stack->blkno != GIN_ROOT_BLKNO &&
             !(i == maxoff && rightlink == InvalidBlockNumber))
          {
             prev_key = gintuple_get_key(&state, prev_tuple,
&prev_key_category);
             if (ginCompareEntries(&state, attnum, prev_key,
                              prev_key_category, current_key,
                              current_key_category) >= 0)

We skip checking the order for the root page, it's not clear why.
Probably there is some mess with the meta page, because
comment says "First block is metadata, skip order check". So I think
we can remove

                    stack->blkno != GIN_ROOT_BLKNO

5) The same place as 4). We skip checking the order for the high key
on the rightmost page, as this key is not really stored explicitly,
but for leaf pages all keys are stored explicitly, so we can check the
order for the last item of the leaf page too.
So I think we can change the condition to this:

            !(i == maxoff && rightlink == InvalidBlockNumber &&
!GinPageIsLeaf(page))

11) When we compare entry tree max page key with parent key:

             if (ginCompareAttEntries(&state, attnum, current_key,
                              current_key_category, parent_key_attnum,
                                      parent_key, parent_key_category) > 0)
             {
                /*
                 * There was a discrepancy between parent and child
                 * tuples. We need to verify it is not a result of
                 * concurrent call of gistplacetopage(). So, lock parent
                 * and try to find downlink for current page. It may be
                 * missing due to concurrent page split, this is OK.
                 */
                pfree(stack->parenttup);
                stack->parenttup = gin_refind_parent(rel, stack->parentblk,
                                            stack->blkno, strategy);

I think we can remove gin_refind_parent() and do ereport right away here.
The same logic as with 3). AFAIK it's impossible to have a child item
with a key that is higher than the cached parent key.
Parent key bounds what keys we can insert into the child page, so it
seems there is no way how they can appear there.
---
 contrib/amcheck/meson.build         |   1 +
 contrib/amcheck/t/006_verify_gin.pl | 197 ++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  37 +++---
 3 files changed, 217 insertions(+), 18 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..8c5975d2e37
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,197 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		"aaaaa",
+		'"ccccc"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		"rrrrrrrrrr",
+		'"zzzzzzzzzz"',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[],b text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = '"' . $attrno_2 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = '"' . $attrno_1 . '$2$3"';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	ok($stderr =~ "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295");
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block
+{
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/$replace/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..26b98571b56 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -463,17 +463,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +529,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +555,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +581,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +590,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
-- 
2.49.0

#84Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Tomas Vondra (#83)
Re: Amcheck verification of GiST and GIN

On Mon, Jun 9, 2025 at 6:34 PM Tomas Vondra <tomas@vondra.me> wrote:

On 6/9/25 00:14, Tomas Vondra wrote:

...

I propose to split it like this, into three parts, each addressing a
particular type of mistake:

1) gin_check_posting_tree_parent_keys_consistency

2) gin_check_parent_keys_consistency / att comparisons

3) gin_check_parent_keys_consistency / setting ptr->parenttup (at the end)

Does this make sense to you? If yes, can you split the patch series like
this, including a commit message for each part, explaining the fix? We'd
need the commit message even with a single patch, ofc.

The attached v5 patch splits it along these lines, except that the extra
0001 part merely adds a multicolumn index into the regression test. The
0002-0004 parts are ordered to match the TAP test, i.e. it adds tests.

Great, thank you.

I've copied the points from the report to the commit messages, but this
needs cleanup/rephrasing, to make it readable. Could you look into
that? Of course, if you think the patches should be split differently,
feel free to move stuff.

Yes, sure, I will do it ASAP.

And as I said before - if you feel the issues are too intertwined and
can't be split like this (or it just doesn't make sense), please speak
up. We can commit that as a single patch. It still needs the commit
message, though.

The way it's split seems reasonable to me. Intertwined issues are
grouped together, and the patches are more or less independent.

Also, the test for the 'posting tree parent_key check' that was added
last started failing locally. I don't know what changed, but I rewrote
it so that it now relies on the child blkno, which is stable (I hope),
instead of a concrete TID. I will include it in the new patchset.
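
For illustration, the reworked check looks roughly like this (just a
sketch: $node, $indexname, and the exact error wording follow the TAP
test in the attached patches; the leaf block number is an assumption
about the test's page layout):

    # assumed: Test::More, $node and $indexname come from the TAP test setup
    my $leaf_blkno = 4;    # posting-tree leaf block; stable, unlike a TID
    my ($result, $stdout, $stderr) =
      $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
    # anchor the match on the block number instead of a concrete TID
    my $expected =
      "tid exceeds parent's high key in postingTree leaf on block $leaf_blkno";
    like($stderr, qr/\Q$expected\E/);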

Best regards,
Arseniy Mukhin

#85Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Arseniy Mukhin (#84)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Mon, Jun 9, 2025 at 7:37 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

On Mon, Jun 9, 2025 at 6:34 PM Tomas Vondra <tomas@vondra.me> wrote:

On 6/9/25 00:14, Tomas Vondra wrote:

...

I propose to split it like this, into three parts, each addressing a
particular type of mistake:

1) gin_check_posting_tree_parent_keys_consistency

2) gin_check_parent_keys_consistency / att comparisons

3) gin_check_parent_keys_consistency / setting ptr->parenttup (at the end)

Does this make sense to you? If yes, can you split the patch series like
this, including a commit message for each part, explaining the fix? We'd
need the commit message even with a single patch, ofc.

The attached v5 patch splits it along these lines, except that the extra
0001 part merely adds a multicolumn index into the regression test. The
0002-0004 parts are ordered to match the TAP test, i.e. it adds tests.

Great, thank you.

I've copied the points from the report to the commit messages, but this
needs cleanup/rephrasing, to make it readable. Could you look into
that? Of course, if you think the patches should be split differently,
feel free to move stuff.

Yes, sure, I will do it ASAP.

Please find a new version in the attachments. It includes formatted
commit messages and some cosmetic changes in the tests. Please let me
know if anything needs to be changed. Also, FWIW, points 9, 10, and 11
from the report [1] were not addressed in the fixes. I'm not sure about
10 and 11, but 9 seems like a no-brainer, so I added a patch deleting
the unused 'parentlsn' field. I tried git-am with the patches and it
works fine. Thank you for the advice; I've added a git-am step to my
patch preparation routine.

...
Also, the test for the 'posting tree parent_key check' that was added
last started failing locally. I don't know what changed, but I rewrote
it so that it now relies on the child blkno, which is stable (I hope),
instead of a concrete TID. I will include it in the new patchset.

I also changed the regex pattern for this failing test; hopefully it
is more robust now.

[1]: /messages/by-id/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com

Best regards,
Arseniy Mukhin

Attachments:

0005-patch-4-remove-unused-parentlsn.patchtext/x-patch; charset=US-ASCII; name=0005-patch-4-remove-unused-parentlsn.patchDownload
From 9e59d4fa552a998386be011b807fbe3329802fc0 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:51:05 +0300
Subject: [PATCH 5/5] patch 4: remove unused parentlsn

This commit removes the unused parentlsn field in gin_index_check()
---
 contrib/amcheck/verify_gin.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..f49e27b1a30 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -38,7 +38,6 @@ typedef struct GinScanItem
 	int			depth;
 	IndexTuple	parenttup;
 	BlockNumber parentblk;
-	XLogRecPtr	parentlsn;
 	BlockNumber blkno;
 	struct GinScanItem *next;
 } GinScanItem;
@@ -417,7 +416,6 @@ gin_check_parent_keys_consistency(Relation rel,
 	stack->depth = 0;
 	stack->parenttup = NULL;
 	stack->parentblk = InvalidBlockNumber;
-	stack->parentlsn = InvalidXLogRecPtr;
 	stack->blkno = GIN_ROOT_BLKNO;
 
 	while (stack)
@@ -428,7 +426,6 @@ gin_check_parent_keys_consistency(Relation rel,
 		OffsetNumber i,
 					maxoff,
 					prev_attnum;
-		XLogRecPtr	lsn;
 		IndexTuple	prev_tuple;
 		BlockNumber rightlink;
 
@@ -438,7 +435,6 @@ gin_check_parent_keys_consistency(Relation rel,
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
-		lsn = BufferGetLSNAtomic(buffer);
 		maxoff = PageGetMaxOffsetNumber(page);
 		rightlink = GinPageGetOpaque(page)->rightlink;
 
@@ -481,7 +477,6 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr->depth = stack->depth;
 				ptr->parenttup = CopyIndexTuple(stack->parenttup);
 				ptr->parentblk = stack->parentblk;
-				ptr->parentlsn = stack->parentlsn;
 				ptr->blkno = rightlink;
 				ptr->next = stack->next;
 				stack->next = ptr;
@@ -611,7 +606,6 @@ gin_check_parent_keys_consistency(Relation rel,
 					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
-				ptr->parentlsn = lsn;
 				ptr->next = stack->next;
 				stack->next = ptr;
 			}
-- 
2.43.0

0003-patch-2-gin_check_parent_keys_consistency.patchtext/x-patch; charset=US-ASCII; name=0003-patch-2-gin_check_parent_keys_consistency.patchDownload
From a7dfd221a809ccd062fec36cd2b7acf025642bfb Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:39:13 +0300
Subject: [PATCH 3/5] patch 2: gin_check_parent_keys_consistency

This commit addresses issues with parent_key checks for the entry tree in gin_index_check():
- parenttup is always NULL. We want to set parenttup to NULL for the rightmost tuple of the level only, in all other cases it should be valid parent_key.
- use GinGetDownlink while retrieving child blkno to avoid triggering Assert (as core GIN code does).
---
 contrib/amcheck/t/006_verify_gin.pl | 80 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  8 +--
 2 files changed, 84 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index 46f693fbb08..fa6baa51546 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -31,6 +31,8 @@ $node->safe_psql(
 invalid_entry_order_leaf_page_test();
 invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -161,6 +163,84 @@ sub invalid_entry_columns_order_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'aaaaaaaaaa',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 5 offset 3";
+	like($stderr, qr/$expected/);
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 5;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'pppppppppp',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 5 offset 3";
+	like($stderr, qr/$expected/);
+}
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 26b98571b56..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -609,10 +609,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -750,7 +750,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.43.0

0001-amcheck-Add-gin_index_check-on-a-multicolumn-index.patchtext/x-patch; charset=US-ASCII; name=0001-amcheck-Add-gin_index_check-on-a-multicolumn-index.patchDownload
From 33da75ed57ef66a47b59f0be8b7feeb6aaf2028e Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:42:52 +0200
Subject: [PATCH 1/5] amcheck: Add gin_index_check on a multicolumn index

This commit adds a regression test to verify that gin_index_check()
correctly handles multicolumn indexes, increasing test coverage.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/expected/check_gin.out | 12 ++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 10 ++++++++++
 2 files changed, 22 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
-- 
2.43.0

0002-patch-1-gin_check_parent_keys_consistency.patchtext/x-patch; charset=US-ASCII; name=0002-patch-1-gin_check_parent_keys_consistency.patchDownload
From e4a6292ff262e754c32fd26325bf386dec3a2981 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:44:59 +0200
Subject: [PATCH 2/5] patch 1: gin_check_parent_keys_consistency

This commit addresses several issues:
- Current code doesn't take into account attribute number when comparing index entries, but for multicolumn indexes attnum defines order.
- Add check that root page entries are ordered (there is nothing special about the root page that would prevent us from checking the order)
- Add checking order for the rightmost entry of the leaf level and skip it only for inner level.
- Split detection fix: in the case of a split, the cached parent key is greater than the current child key, not less.
---
 contrib/amcheck/meson.build         |   1 +
 contrib/amcheck/t/006_verify_gin.pl | 200 ++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  37 ++---
 3 files changed, 220 insertions(+), 18 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..46f693fbb08
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,200 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789bcdfghjkmnpqrstvwxyz', ceil(random() * 30)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		'aaaaa',
+		'ccccc',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string(1870) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string(1870) ||'}')::text[]);
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		'rrrrrrrrrr',
+		'zzzzzzzzzz',
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[],b text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		SELECT gin_clean_pending_list('$indexname');
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = $attrno_2 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = $attrno_1 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+sub string_replace_block
+{
+	my ($filename, $find, $replace, $blksize, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/'"' . $replace . '"'/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..26b98571b56 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -463,17 +463,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +529,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +555,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +581,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +590,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
-- 
2.43.0

0004-patch-3-gin_check_posting_tree_parent_keys_consisten.patchtext/x-patch; charset=US-ASCII; name=0004-patch-3-gin_check_posting_tree_parent_keys_consisten.patchDownload
From cb6f5b84e3bf5f5716c5470ffb4823563b23d21f Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:49:11 +0300
Subject: [PATCH 4/5] patch 3: gin_check_posting_tree_parent_keys_consistency

This commit addresses issues with parent_key checks for the posting tree in gin_index_check():
- Check if parentkey is valid before using it in the comparison. Checking that stack->parentblk is a valid block number is not enough.
- Set invalid parentkey for the rightmost child only, not for all children of the rightmost page.
---
 contrib/amcheck/t/006_verify_gin.pl | 40 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        | 12 +++------
 2 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index fa6baa51546..a7c7c1f5a61 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -33,6 +33,7 @@ invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
 inconsistent_with_parent_key__parent_key_corrupted_test();
 inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -241,6 +242,45 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. replace 4th page's high key with (1,1)
+	# so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S*', 0, 4, 0) . '....';
+	my $replace = pack('S*', 0, 4, 0, 1, 1);
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blksize,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4";
+	like($stderr, qr/$expected/);
+}
+
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 8f6a5410cb7..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.43.0

#86Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Arseniy Mukhin (#85)
Re: Amcheck verification of GiST and GIN

Hi Arseniy!

Thanks for finding these problems.
I made several attempts to wrap my head around the original patch with fixes, but once it was broken into several steps it finally became much easier for me.
Here are some thoughts about the patches.

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0001-amcheck-Add-gin_index_check-on-a-multicolumn-index.patch>

The test seems harmless and nice to have. I understand that this test is needed to extend coverage.
Perhaps we could verify that some code is actually triggered. Personally, I would be happy if we could add some injection points with notices at the tested branches. But, AFAIK, it's too much of a burden to have injection points in contrib extensions. We had a very similar problem with the sort patch in btree_gist and eventually gave up. elog(DEBUG) was not a good solution either, because it was unstable.
See 'gin-finish-incomplete-split' or 'hash-aggregate-enter-spill-mode' for reference.

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0002-patch-1-gin_check_parent_keys_consistency.patch>

Well, we inherited ginCompareEntries() from the very first patch version from 2020. I can't really say anything about the differences here, but your proposed change seems correct.

Kirill excluded rightmost keys in v33, and that was kind of a fix. Kirill, do you remember whether it was a problem specific to internal pages? Is it safe to enable the tuple order check for rightmost tuples on leaf pages?

You wrote this comment:
+			/*
+			 * First block is metadata, skip order check. Also, never check
+			 * for high key on rightmost page, as this key is not really
+			 * stored explicitly.
+			 */

I agree that the exclusion (stack->blkno != GIN_ROOT_BLKNO) makes no sense. It has been with us since the original version from 2020. As I understand it, some checks on the root page will be exercised by the invalid_entry_columns_order_test test.

Having some TAP tests sounds like a very good idea.

I'm a bit surprised by the exclusion of some letters from random_string(), but perhaps it's fine for this test.

Somewhere here:
+ INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
I'd like to have a comment explaining the number 1870. And, presumably, you expect exactly 2 tuples on the root page, right?

Are we 100% certain that 'rrrrrrrrr' will always be on the root page?

I do not see much value in having the variables $relname and $indexname. I'd just replace their usages with literals. But I'm not sure; maybe this structure will be used in your tests later...

In this function
+sub string_replace_block
I'd suggest adding a few comments. Also, perhaps, an fsync of the file, but 001_verify_heapam.pl does not do fsync, so maybe it's OK here too.

Also, I have a wild idea. Maybe add an assert that the block size is 8192 and just exit otherwise?

And maybe, instead of gin_clean_pending_list(), you could just create the index with fastupdate=off.
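For example, something along these lines in the test setup (just a sketch; fastupdate is the GIN storage parameter that controls the pending list):

    $node->safe_psql(
        'postgres', qq(
        CREATE INDEX test_gin_idx ON test USING gin (a)
            WITH (fastupdate = off);
    ));

With fastupdate disabled, new entries go straight into the index, so there is no pending list to clean up afterwards.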

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0003-patch-2-gin_check_parent_keys_consistency.patch>

The patch seems correct to me.
Except this
+ my $blkno = 5; # leaf
in the test reads scary. Will it be stable on the buildfarm?

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0004-patch-3-gin_check_posting_tree_parent_keys_consisten.patch>

I generally agree with the direction of this patch.
But please also check the approach of PageGetItemIdCareful() in verify_nbtree.c. It goes the extra mile to avoid a core dump in case of a bogus ItemId. Should we do something like that here too?

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

<0005-patch-4-remove-unused-parentlsn.patch>

LGTM.

On 9 May 2025, at 17:43, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

10) README says "Vacuum never deletes tuples or pages from the entry
tree." But check assumes that it's possible to have
deleted leaf page with 0 entries.

if (GinPageIsDeleted(page))
{
if (!GinPageIsLeaf(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted internal page %u",
RelationGetRelationName(rel), blockNo)));
if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted page %u with tuples",
RelationGetRelationName(rel), blockNo)));
}

To enforce such an invariant we must be sure that GIN never deleted entry pages in older versions. I do not have enough knowledge of the history to say.

11) When we compare entry tree max page key with parent key:

if (ginCompareAttEntries(&state, attnum, current_key,
current_key_category, parent_key_attnum,
parent_key, parent_key_category) > 0)
{
/*
* There was a discrepancy between parent and child
* tuples. We need to verify it is not a result of
* concurrent call of gistplacetopage(). So, lock parent
* and try to find downlink for current page. It may be
* missing due to concurrent page split, this is OK.
*/
pfree(stack->parenttup);
stack->parenttup = gin_refind_parent(rel, stack->parentblk,
stack->blkno, strategy);

I think we can remove gin_refind_parent() and do ereport right away here.
The same logic as with 3). AFAIK it's impossible to have a child item
with a key that is higher than the cached parent key.
The parent key bounds what keys we can insert into the child page, so it
seems there is no way they can appear there.

This logic was copied from the GiST check. In GiST the "area of responsibility" of an internal tuple can be extended in any direction. That's why we need to lock the parent page.
If in GIN an internal tuple's keyspace is never extended, it's OK to avoid gin_refind_parent().
But reasoning about GIN concurrency is rather difficult. Unfortunately, we do not have such checks in B-tree verification without a ShareLock. Either way, we could borrow some ideas from there.

Thank you!

Best regards, Andrey Borodin.

#87Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Andrey Borodin (#86)
5 attachment(s)
Re: Amcheck verification of GiST and GIN

On Sun, Jun 15, 2025 at 4:24 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi Arseniy!

Thanks for finding these problems.
I had several attempts to wrap my head around original patch with fixes, but when it was broken into several steps it finally became easier for me.
Here are some thought about patches.

Hi Andrey! Thank you for the review.

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0001-amcheck-Add-gin_index_check-on-a-multicolumn-index.patch>

The test seems harmless and nice to have. I understand that this test is needed to extend coverage.
Perhaps we could verify that some code is actually triggered. Personally, I would be happy if we could add some injection points with notices at the tested branches. But, AFAIK, it's too much of a burden to have injection points in contrib extensions. We had a very similar problem with the sort patch in btree_gist and eventually gave up. elog(DEBUG) was not a good solution either, because it was unstable.
See 'gin-finish-incomplete-split' or 'hash-aggregate-enter-spill-mode' for reference.

I'm not very familiar with injection points, but I think I got the
idea; it sounds interesting. Thank you for the references.

Having some TAP tests sounds like a very good idea.

I'm a bit surprised by the exclusion of some letters from random_string(), but perhaps it's fine for this test.

Yeah, there is no reason why we can't use vowels here, so I will add
them so that it doesn't look like there is any point in their absence.

Somewhere here:
+ INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string(1870) ||'}')::text[]);
I'd like to have a comment explaining the number 1870. And, presumably, you expect exactly 2 tuples on the root page, right?

The idea behind "random_string(1870)" was to get a split as fast as
possible, but tuples larger than about 2 kB get toasted, so we have to
use something around 2k here. I think I took 1870 from some other place
where it was necessary, but here we can round it to 1900. So I'll
replace 1870 with 1900 and add a comment about the size. I'm also going
to add some comments about the datasets in some tests to make them clearer.
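For reference, an easy way to eyeball the size of the resulting value (purely illustrative, reusing the test's random_string() helper):

    $node->safe_psql('postgres', q(
        SELECT pg_column_size(('{' || 'pppppppppp' || random_string(1900) || '}')::text[]);
    ));

which should stay a bit below the ~2 kB mark mentioned above.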

Are we 100% certain that 'rrrrrrrrr' will always be on root page?

I'm not 100% sure. AFAIK the split algorithm is deterministic and the
idea was that if we use very long tuples, then all other factors will
be too small to influence what key we will see on the root page.

I do not see much value in having the variables $relname and $indexname. I'd just replace their usages with literals. But I'm not sure; maybe this structure will be used in your tests later...

I added the variables just because we use the index name and table name
several times, but I don't mind getting rid of them.

In this function
+sub string_replace_block
I'd suggest adding a few comments. Also, perhaps, an fsync of the file, but 001_verify_heapam.pl does not do fsync, so maybe it's OK here too.

Will add a comment here.

Also, I have a wild idea. Maybe add an assert that the block size is 8192 and just exit otherwise?

I like the idea. I thought it might be great to have a function that
every TAP test can use if it needs a certain block size?
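Something along these lines, perhaps (just a sketch; the helper name is made up):

    sub skip_all_unless_block_size
    {
        my ($node, $expected) = @_;
        my $blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));

        # skip the whole test file when the cluster uses a different
        # block size than the on-disk corruption in the test expects
        plan skip_all => "test requires block_size=$expected, have $blksize"
          if $blksize != $expected;
        return $blksize;
    }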

And maybe, instead of gin_clean_pending_list(), you could just create the index with fastupdate=off.

Yeah, I think we can make it even simpler if we move the index creation
to the end, as the regression tests do.

On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:
<0003-patch-2-gin_check_parent_keys_consistency.patch>

The patch seems correct to me.
Except this
+ my $blkno = 5; # leaf
in the test reads scary. Will it be stable on the buildfarm?

Not sure, but I thought that blkno should be more or less the same everywhere.

On 9 May 2025, at 17:43, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

10) README says "Vacuum never deletes tuples or pages from the entry
tree." But check assumes that it's possible to have
deleted leaf page with 0 entries.

if (GinPageIsDeleted(page))
{
if (!GinPageIsLeaf(page))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted internal page %u",
RelationGetRelationName(rel), blockNo)));
if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("index \"%s\" has deleted page %u with tuples",
RelationGetRelationName(rel), blockNo)));
}

To enforce such an invariant we must be sure that GIN never deleted entry pages in older versions. I do not have enough knowledge of the history to say.

Agree, good point.

11) When we compare entry tree max page key with parent key:

if (ginCompareAttEntries(&state, attnum, current_key,
current_key_category, parent_key_attnum,
parent_key, parent_key_category) > 0)
{
/*
* There was a discrepancy between parent and child
* tuples. We need to verify it is not a result of
* concurrent call of gistplacetopage(). So, lock parent
* and try to find downlink for current page. It may be
* missing due to concurrent page split, this is OK.
*/
pfree(stack->parenttup);
stack->parenttup = gin_refind_parent(rel, stack->parentblk,
stack->blkno, strategy);

I think we can remove gin_refind_parent() and do ereport right away here.
The same logic as with 3). AFAIK it's impossible to have a child item
with a key that is higher than the cached parent key.
The parent key bounds what keys we can insert into the child page, so it
seems there is no way they can appear there.

This logic was copied from the GiST check. In GiST the "area of responsibility" of an internal tuple can be extended in any direction. That's why we need to lock the parent page.
If in GIN an internal tuple's keyspace is never extended, it's OK to avoid gin_refind_parent().
But reasoning about GIN concurrency is rather difficult. Unfortunately, we do not have such checks in B-tree verification without a ShareLock. Either way, we could borrow some ideas from there.

Got it.

Here is the new version. I fixed some of the points that Andrey
mentioned, all of them in the TAP test. Several comments were added, and
the filler size was changed from 1870 to 1900. I also added vowels to
random_string() and moved index creation after the data filling. Thank you!

Best regards,
Arseniy Mukhin

Attachments:

v7-0002-patch-1-gin_check_parent_keys_consistency.patchtext/x-patch; charset=US-ASCII; name=v7-0002-patch-1-gin_check_parent_keys_consistency.patchDownload
From 64364caee85683f73818628decd066b33f01a7dd Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:44:59 +0200
Subject: [PATCH v7 2/5] patch 1: gin_check_parent_keys_consistency

This commit addresses several issues:
- The current code doesn't take the attribute number into account when comparing index entries, but for multicolumn indexes attnum defines the order.
- Add a check that root page entries are ordered (there is nothing special about the root page that would prevent us from checking the order).
- Check the order of the rightmost entry on the leaf level, and skip it only on the inner level.
- Split detection fix: in the case of a split, the cached parent key is greater than the current child key, not less.
---
 contrib/amcheck/meson.build         |   1 +
 contrib/amcheck/t/006_verify_gin.pl | 199 ++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  37 +++---
 3 files changed, 219 insertions(+), 18 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..e73c9d9f92e
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,199 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+# to get the split fast, we want tuples to be as large as possible, but the same time we don't want them to be toasted.
+my $filler_size = 1900;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789abcdefghijklmnopqrstuvwxyz', ceil(random() * 36)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		'aaaaa',
+		'ccccc',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# to break the order in the inner page we need at least 3 items (rightmost key in the inner level is not checked for the order)
+	# so fill table until we have 2 splits
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		'rrrrrrrrrr',
+		'zzzzzzzzzz',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[],b text[]);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = $attrno_2 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = $attrno_1 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+# substitute pattern 'find' with 'replace' within the block with number 'blkno' in the file 'filename'
+sub string_replace_block
+{
+	my ($filename, $find, $replace, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/'"' . $replace . '"'/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..26b98571b56 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -463,17 +463,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +529,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +555,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +581,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +590,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
-- 
2.43.0

v7-0003-patch-2-gin_check_parent_keys_consistency.patchtext/x-patch; charset=US-ASCII; name=v7-0003-patch-2-gin_check_parent_keys_consistency.patchDownload
From 79d67a965baca64e45cd9d08ef37576bfee43ff7 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:39:13 +0300
Subject: [PATCH v7 3/5] patch 2: gin_check_parent_keys_consistency

This commit addresses issues with parent_key checks for the entry tree in gin_index_check():
- parenttup is always NULL. We want to set parenttup to NULL only for the rightmost tuple of the level; in all other cases it should be a valid parent key.
- Use GinGetDownlink when retrieving the child blkno to avoid triggering an Assert (as the core GIN code does).
---
 contrib/amcheck/t/006_verify_gin.pl | 78 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  8 +--
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index e73c9d9f92e..b997fc85f86 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -34,6 +34,8 @@ $node->safe_psql(
 invalid_entry_order_leaf_page_test();
 invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -159,6 +161,82 @@ sub invalid_entry_columns_order_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'aaaaaaaaaa',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 3;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'pppppppppp',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 26b98571b56..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -609,10 +609,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -750,7 +750,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.43.0

v7-0004-patch-3-gin_check_posting_tree_parent_keys_consis.patchtext/x-patch; charset=US-ASCII; name=v7-0004-patch-3-gin_check_posting_tree_parent_keys_consis.patchDownload
From b7b487b2d1dd97fbf6062674515711b9c2e4409e Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:49:11 +0300
Subject: [PATCH v7 4/5] patch 3:
 gin_check_posting_tree_parent_keys_consistency

This commit addresses issues with parent_key checks for the posting tree in gin_index_check():
- Check if parentkey is valid before using it in the comparison. Checking that stack->parentblk is a valid block number is not enough.
- Set invalid parentkey for the rightmost child only, not for all children of the rightmost page.
---
 contrib/amcheck/t/006_verify_gin.pl | 39 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        | 12 +++------
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index b997fc85f86..ff857aa9886 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -36,6 +36,7 @@ invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
 inconsistent_with_parent_key__parent_key_corrupted_test();
 inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -237,6 +238,44 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. replace 4th page's high key with (1,1)
+	# so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S*', 0, 4, 0) . '....';
+	my $replace = pack('S*', 0, 4, 0, 1, 1);
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4";
+	like($stderr, qr/$expected/);
+}
+
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 8f6a5410cb7..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.43.0

v7-0005-patch-4-remove-unused-parentlsn.patchtext/x-patch; charset=US-ASCII; name=v7-0005-patch-4-remove-unused-parentlsn.patchDownload
From 4344b1b27a554139db20c3a1ba6f9250a577a45d Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:51:05 +0300
Subject: [PATCH v7 5/5] patch 4: remove unused parentlsn

This commit removes the unused field parentlsn in gin_index_check().
---
 contrib/amcheck/verify_gin.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..f49e27b1a30 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -38,7 +38,6 @@ typedef struct GinScanItem
 	int			depth;
 	IndexTuple	parenttup;
 	BlockNumber parentblk;
-	XLogRecPtr	parentlsn;
 	BlockNumber blkno;
 	struct GinScanItem *next;
 } GinScanItem;
@@ -417,7 +416,6 @@ gin_check_parent_keys_consistency(Relation rel,
 	stack->depth = 0;
 	stack->parenttup = NULL;
 	stack->parentblk = InvalidBlockNumber;
-	stack->parentlsn = InvalidXLogRecPtr;
 	stack->blkno = GIN_ROOT_BLKNO;
 
 	while (stack)
@@ -428,7 +426,6 @@ gin_check_parent_keys_consistency(Relation rel,
 		OffsetNumber i,
 					maxoff,
 					prev_attnum;
-		XLogRecPtr	lsn;
 		IndexTuple	prev_tuple;
 		BlockNumber rightlink;
 
@@ -438,7 +435,6 @@ gin_check_parent_keys_consistency(Relation rel,
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
-		lsn = BufferGetLSNAtomic(buffer);
 		maxoff = PageGetMaxOffsetNumber(page);
 		rightlink = GinPageGetOpaque(page)->rightlink;
 
@@ -481,7 +477,6 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr->depth = stack->depth;
 				ptr->parenttup = CopyIndexTuple(stack->parenttup);
 				ptr->parentblk = stack->parentblk;
-				ptr->parentlsn = stack->parentlsn;
 				ptr->blkno = rightlink;
 				ptr->next = stack->next;
 				stack->next = ptr;
@@ -611,7 +606,6 @@ gin_check_parent_keys_consistency(Relation rel,
 					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
-				ptr->parentlsn = lsn;
 				ptr->next = stack->next;
 				stack->next = ptr;
 			}
-- 
2.43.0

v7-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchtext/x-patch; charset=US-ASCII; name=v7-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchDownload
From 426895c13c77eab3e988a49da19434c0888323e0 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:42:52 +0200
Subject: [PATCH v7 1/5] amcheck: Add gin_index_check on a multicolumn index

This commit adds a regression test to verify that gin_index_check()
correctly handles multicolumn indexes, increases its test coverage.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/expected/check_gin.out | 12 ++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 10 ++++++++++
 2 files changed, 22 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
-- 
2.43.0

#88Tomas Vondra
tomas@vondra.me
In reply to: Arseniy Mukhin (#87)
6 attachment(s)
Re: Amcheck verification of GiST and GIN

Thanks.

I went through the patches, polished the commit messages, and did some
minor tweaks in patch 0002 (to make the variable names a bit more
consistent, and to reduce the scope a little bit). I left those tweaks
as a separate patch to make the changes clearer, but it should be merged
into 0002.

Please read through the commit messages, and let me know if I got some
of the details wrong (or not clear enough). Otherwise I plan to start
pushing this soon (~tomorrow).

regards

--
Tomas Vondra

Attachments:

v8-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchtext/x-patch; charset=UTF-8; name=v8-0001-amcheck-Add-gin_index_check-on-a-multicolumn-inde.patchDownload
From cb24bb068582a39df9e9e59c2a9347889e896cf2 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:42:52 +0200
Subject: [PATCH v8 1/6] amcheck: Add gin_index_check on a multicolumn index

Adds a regression test with gin_index_check() on a multicolumn index,
to verify it's handled correctly and improve test coverage for code
introduced by 14ffaece0fb5.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/expected/check_gin.out | 12 ++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 10 ++++++++++
 2 files changed, 22 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
-- 
2.49.0

v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patchtext/x-patch; charset=UTF-8; name=v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patchDownload
From 04b0c0c718dd109b9b4598d316b27daab281688d Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 9 Jun 2025 01:44:59 +0200
Subject: [PATCH v8 2/6] amcheck: Fix checks of entry order for GIN indexes

This tightens a couple of checks on GIN indexes, which might have
resulted in incorrect results (false positives/negatives).

* The code skipped ordering checks if the entries were for different
  attributes (for multi-column GIN indexes), possibly missing some cases
  of data corruption. But the attribute number is part of the ordering,
  so we can check that.

* The root page was skipped when checking entry order, but that is
  unnecessary. The root page is subject to the same ordering rules; we
  can process it just like any other page.

* The high key on the right-most page was not checked, but that is
  needed only for inner pages (we don't store the high key for those).
  For leaf pages we can check the high key just fine.

* Correct the detection of split pages. If the page gets split, the
  cached parent key is greater than the current child key (not less, as
  the code incorrectly expected).

Issues reported by Arseniy Mukhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/meson.build         |   1 +
 contrib/amcheck/t/006_verify_gin.pl | 199 ++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  37 +++---
 3 files changed, 219 insertions(+), 18 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..7fdde170e06
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,199 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+# to get the split fast, we want tuples to be as large as possible, but the same time we don't want them to be toasted.
+my $filler_size = 1900;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789abcdefghijklmnopqrstuvwxyz', ceil(random() * 36)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		'aaaaa',
+		'ccccc',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# to break the order in the inner page we need at least 3 items (rightmost key in the inner level is not checked for the order)
+	# so fill table until we have 2 splits
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		'rrrrrrrrrr',
+		'zzzzzzzzzz',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[],b text[]);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = $attrno_2 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = $attrno_1 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+# substitute pattern 'find' with 'replace' within the block with number 'blkno' in the file 'filename'
+sub string_replace_block
+{
+	my ($filename, $find, $replace, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/'"' . $replace . '"'/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..26b98571b56 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -463,17 +463,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +529,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +555,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +581,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +590,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
-- 
2.49.0
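
A rough, standalone C sketch of the ordering rule the commit message above describes: entries in the GIN entry tree sort by attribute number first and by key second, so a pair of adjacent entries is out of order whenever a comparison such as the one below returns >= 0. The struct and the strcmp-based key comparison are simplifications for illustration only, not PostgreSQL's ginCompareAttEntries; the sample values mirror the corrupted (2,aaa)/(1,bbb) root that invalid_entry_columns_order_test creates.

#include <stdio.h>
#include <string.h>

/* simplified entry: real GIN entries carry a Datum key and a null category */
typedef struct
{
	int			attnum;
	const char *key;
} Entry;

static int
compare_att_entries(const Entry *a, const Entry *b)
{
	/* the attribute number is the most significant part of the ordering */
	if (a->attnum != b->attnum)
		return (a->attnum < b->attnum) ? -1 : 1;
	return strcmp(a->key, b->key);
}

int
main(void)
{
	Entry		prev = {2, "aaa"};	/* corrupted root: attnum 2 stored first */
	Entry		cur = {1, "bbb"};

	/* >= 0 means the on-page order is broken */
	printf("out of order: %s\n",
		   compare_att_entries(&prev, &cur) >= 0 ? "yes" : "no");
	return 0;
}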

v8-0003-tweaks.patchtext/x-patch; charset=UTF-8; name=v8-0003-tweaks.patchDownload
From d79cdc4ef65a23d8d42eb6f150373f6a76490c54 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@vondra.me>
Date: Mon, 16 Jun 2025 17:09:18 +0200
Subject: [PATCH v8 3/6] tweaks

---
 contrib/amcheck/verify_gin.c | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 26b98571b56..25d47eefcbc 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -514,9 +514,7 @@ gin_check_parent_keys_consistency(Relation rel,
 		{
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
-			GinNullCategory prev_key_category;
-			Datum		prev_key;
+			OffsetNumber current_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory current_key_category;
 			Datum		current_key;
 
@@ -529,17 +527,23 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * Never check for high key on rightmost inner page, as this key
-			 * is not really stored explicitly.
+			 * Compare the entry to the preceding one.
 			 *
-			 * Also make sure to not compare entries for different attnums,
-			 * which may be stored on the same page.
+			 * Don't check for high key on the rightmost inner page, as this
+			 * key is not really stored explicitly.
+			 *
+			 * The entries may be for different attributes, so make sure to
+			 * use ginCompareAttEntries for comparison.
 			 */
-			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
+			if ((i != FirstOffsetNumber) &&
+				!(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
+				Datum		prev_key;
+				GinNullCategory prev_key_category;
+
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
 				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
-										 prev_key_category, attnum,
+										 prev_key_category, current_attnum,
 										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -560,7 +564,7 @@ gin_check_parent_keys_consistency(Relation rel,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareAttEntries(&state, attnum, current_key,
+				if (ginCompareAttEntries(&state, current_attnum, current_key,
 										 current_key_category, parent_key_attnum,
 										 parent_key, parent_key_category) > 0)
 				{
@@ -590,7 +594,7 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareAttEntries(&state, attnum, current_key,
+						if (ginCompareAttEntries(&state, current_attnum, current_key,
 												 current_key_category, parent_key_attnum,
 												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
@@ -645,7 +649,7 @@ gin_check_parent_keys_consistency(Relation rel,
 			}
 
 			prev_tuple = CopyIndexTuple(idxtuple);
-			prev_attnum = attnum;
+			prev_attnum = current_attnum;
 		}
 
 		LockBuffer(buffer, GIN_UNLOCK);
-- 
2.49.0

v8-0004-amcheck-Fix-parent-key-check-in-gin_index_check.patchtext/x-patch; charset=UTF-8; name=v8-0004-amcheck-Fix-parent-key-check-in-gin_index_check.patchDownload
From 7326918e2ef50a52233c119712c7bc7abec993ba Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:39:13 +0300
Subject: [PATCH v8 4/6] amcheck: Fix parent key check in gin_index_check()

The checks introduced by commit 14ffaece0fb5 did not get the parent key
checks quite right, missing some data corruption cases. In particular:

* The "rightlink" check was not working as intended, because rightlink
  is a BlockNumber, and InvalidBlockNumber is 0xFFFFFFFF, so

    !GinPageGetOpaque(page)->rightlink

  almost always evaluates to false (except for rightlink=0). So in most
  cases parenttup was left NULL, preventing any checks against parent.

* Use GinGetDownlink() to retrieve child blkno to avoid triggering
  Assert, same as the core GIN code.

Issues reported by Arseniy Mikhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/t/006_verify_gin.pl | 78 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  8 +--
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index 7fdde170e06..308e53b2f75 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -34,6 +34,8 @@ $node->safe_psql(
 invalid_entry_order_leaf_page_test();
 invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -159,6 +161,82 @@ sub invalid_entry_columns_order_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as the parent key in the root, so replace it with something smaller than the child's keys
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'aaaaaaaaaa',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 3;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'pppppppppp',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 25d47eefcbc..ae36a4ea587 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -613,10 +613,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -754,7 +754,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.49.0
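
A rough, standalone C sketch of the rightlink pitfall described in the commit message above. The BlockNumber typedef and the InvalidBlockNumber value mirror PostgreSQL's definitions, but the program itself is only an illustration, not code from the patch.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

int
main(void)
{
	BlockNumber rightlink = InvalidBlockNumber;	/* rightmost page */

	/* buggy test: only true for rightlink == 0, so it misses 0xFFFFFFFF */
	printf("!rightlink                      -> %d\n", !rightlink);

	/* correct test: compare explicitly against InvalidBlockNumber */
	printf("rightlink == InvalidBlockNumber -> %d\n",
		   rightlink == InvalidBlockNumber);
	return 0;
}

The point is that a BlockNumber of 0xFFFFFFFF is truthy, so the branch guarded by !rightlink was effectively dead; hence the patch spells out the comparison against InvalidBlockNumber.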

v8-0005-amcheck-Fix-posting-tree-checks-in-gin_index_chec.patchtext/x-patch; charset=UTF-8; name=v8-0005-amcheck-Fix-posting-tree-checks-in-gin_index_chec.patchDownload
From 65efe9243fcbabe0b8f28fcc743524060cadd730 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:49:11 +0300
Subject: [PATCH v8 5/6] amcheck: Fix posting tree checks in gin_index_check()

Fix two issues in parent_key validation in posting trees:

* It's not enough to check stack->parentblk is valid to determine if the
  parentkey is valid. It's possible parentblk is set to a valid block
  number, but parentkey is invalid. So check parentkey directly.

* We don't need to invalidate parentkey for all child pages of the
  rightmost page. It's enough to invalidate it for the rightmost child
  only, which means we can check more cases (fewer false negatives).

Issues reported by Arseniy Mikhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/t/006_verify_gin.pl | 39 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        | 12 +++------
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index 308e53b2f75..e540cd6606a 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -36,6 +36,7 @@ invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
 inconsistent_with_parent_key__parent_key_corrupted_test();
 inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -237,6 +238,44 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for the 'aaaaa' key with the root at block 2
+	# and two leaf pages, 3 and 4. Replace the 4th page's high key with (1,1)
+	# so that there are tids in the leaf page that are larger than the new high key.
+	my $find = pack('S*', 0, 4, 0) . '....';
+	my $replace = pack('S*', 0, 4, 0, 1, 1);
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4";
+	like($stderr, qr/$expected/);
+}
+
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index ae36a4ea587..a1d57ac551f 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.49.0
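
A rough, standalone C sketch of the posting-tree invariant these checks tighten: a leaf tid must not exceed the parent's downlink key, and an invalid parent key stands for +infinity, which now applies only to the rightmost child. The Tid struct and comparison below are simplifications, not PostgreSQL's ItemPointer code; the (1,1) key mirrors the corrupted high key injected by the new test.

#include <stdbool.h>
#include <stdio.h>

/* simplified stand-in for ItemPointerData: (block, offset), offset 0 = invalid */
typedef struct
{
	unsigned	block;
	unsigned short offset;
} Tid;

static bool
tid_is_valid(Tid t)
{
	return t.offset != 0;
}

static int
tid_cmp(Tid a, Tid b)
{
	if (a.block != b.block)
		return (a.block < b.block) ? -1 : 1;
	if (a.offset != b.offset)
		return (a.offset < b.offset) ? -1 : 1;
	return 0;
}

/* true when a child tid violates the parent's downlink key */
static bool
exceeds_parent_key(Tid parent_key, Tid child_tid)
{
	if (!tid_is_valid(parent_key))	/* rightmost child: key is +infinity */
		return false;
	return tid_cmp(parent_key, child_tid) < 0;
}

int
main(void)
{
	Tid			parent = {1, 1};	/* corrupted high key (1,1), as in the test */
	Tid			ok = {1, 1};
	Tid			bad = {2, 3};

	printf("(1,1) violates: %d\n", exceeds_parent_key(parent, ok));
	printf("(2,3) violates: %d\n", exceeds_parent_key(parent, bad));
	return 0;
}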

v8-0006-amcheck-Remove-unused-GinScanItem-parentlsn-field.patchtext/x-patch; charset=UTF-8; name=v8-0006-amcheck-Remove-unused-GinScanItem-parentlsn-field.patchDownload
From ffc02618b19ff315f8daf992ad4a6d3d577adcca Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Date: Mon, 9 Jun 2025 20:51:05 +0300
Subject: [PATCH v8 6/6] amcheck: Remove unused GinScanItem->parentlsn field

The field was introduced by commit 14ffaece0fb5, but is unused and
unnecessary. So remove it.

Issues reported by Arseniy Mikhin, along with a proposed patch. Review
by Andrey M. Borodin, cleanup and improvements by me.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=YqBBncewp7-k64VWs+sjk7XF6fJUX77uFBA@mail.gmail.com
---
 contrib/amcheck/verify_gin.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index a1d57ac551f..c615d950736 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -38,7 +38,6 @@ typedef struct GinScanItem
 	int			depth;
 	IndexTuple	parenttup;
 	BlockNumber parentblk;
-	XLogRecPtr	parentlsn;
 	BlockNumber blkno;
 	struct GinScanItem *next;
 } GinScanItem;
@@ -417,7 +416,6 @@ gin_check_parent_keys_consistency(Relation rel,
 	stack->depth = 0;
 	stack->parenttup = NULL;
 	stack->parentblk = InvalidBlockNumber;
-	stack->parentlsn = InvalidXLogRecPtr;
 	stack->blkno = GIN_ROOT_BLKNO;
 
 	while (stack)
@@ -428,7 +426,6 @@ gin_check_parent_keys_consistency(Relation rel,
 		OffsetNumber i,
 					maxoff,
 					prev_attnum;
-		XLogRecPtr	lsn;
 		IndexTuple	prev_tuple;
 		BlockNumber rightlink;
 
@@ -438,7 +435,6 @@ gin_check_parent_keys_consistency(Relation rel,
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
-		lsn = BufferGetLSNAtomic(buffer);
 		maxoff = PageGetMaxOffsetNumber(page);
 		rightlink = GinPageGetOpaque(page)->rightlink;
 
@@ -481,7 +477,6 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr->depth = stack->depth;
 				ptr->parenttup = CopyIndexTuple(stack->parenttup);
 				ptr->parentblk = stack->parentblk;
-				ptr->parentlsn = stack->parentlsn;
 				ptr->blkno = rightlink;
 				ptr->next = stack->next;
 				stack->next = ptr;
@@ -615,7 +610,6 @@ gin_check_parent_keys_consistency(Relation rel,
 					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
-				ptr->parentlsn = lsn;
 				ptr->next = stack->next;
 				stack->next = ptr;
 			}
-- 
2.49.0

#89Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Tomas Vondra (#88)
Re: Amcheck verification of GiST and GIN

On Mon, Jun 16, 2025 at 6:58 PM Tomas Vondra <tomas@vondra.me> wrote:

Thanks.

I went through the patches, polished the commit messages and did some
minor tweaks in patch 0002 (to make the variable names a bit more
consistent, and reduce the scope a little bit). I left it as a separate
patch to make the changes clearer, but it should be merged into 0002.

Please read through the commit messages, and let me know if I got some
of the details wrong (or not clear enough). Otherwise I plan to start
pushing this soon (~tomorrow).

LGTM.
Noticed a few typos in messages:
in v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patch
- parent key is creator
- as the core incorrectly expected
and 'Arseniy Mikhin' in some patches.

Best regards,
Arseniy Mukhin

#90Tomas Vondra
tomas@vondra.me
In reply to: Arseniy Mukhin (#89)
Re: Amcheck verification of GiST and GIN

On 6/16/25 21:09, Arseniy Mukhin wrote:

On Mon, Jun 16, 2025 at 6:58 PM Tomas Vondra <tomas@vondra.me> wrote:

Thanks.

I went through the patches, polished the commit messages and did some
minor tweaks in patch 0002 (to make the variable names a bit more
consistent, and reduce the scope a little bit). I left it as a separate
patch to make the changes clearer, but it should be merged into 0002.

Please read through the commit messages, and let me know if I got some
of the details wrong (or not clear enough). Otherwise I plan to start
pushing this soon (~tomorrow).

LGTM.
Noticed a few typos in messages:
in v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patch
- parent key is creator
- as the core incorrectly expected
and 'Arseniy Mikhin' in some patches.

Thanks for noticing those typos, especially the one in the name.

regards

--
Tomas Vondra

#91Thom Brown
thom@linux.com
In reply to: Tomas Vondra (#90)
Re: Amcheck verification of GiST and GIN

On Mon, 16 Jun 2025 at 21:00, Tomas Vondra <tomas@vondra.me> wrote:

On 6/16/25 21:09, Arseniy Mukhin wrote:

On Mon, Jun 16, 2025 at 6:58 PM Tomas Vondra <tomas@vondra.me> wrote:

Thanks.

I went through the patches, polished the commit messages and did some
minor tweaks in patch 0002 (to make the variable names a bit more
consistent, and reduce the scope a little bit). I left it as a separate
patch to make the changes clearer, but it should be merged into 0002.

Please read through the commit messages, and let me know if I got some
of the details wrong (or not clear enough). Otherwise I plan to start
pushing this soon (~tomorrow).

LGTM.
Noticed a few typos in messages:
in v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patch
- parent key is creator
- as the core incorrectly expected
and 'Arseniy Mikhin' in some patches.

Thanks for noticing those typos, especially the one in the name.

Do today's commits clear this from the PostgreSQL 18 Open Items list?

Thom

#92Tomas Vondra
tomas@vondra.me
In reply to: Thom Brown (#91)
Re: Amcheck verification of GiST and GIN

On 6/17/25 16:19, Thom Brown wrote:

On Mon, 16 Jun 2025 at 21:00, Tomas Vondra <tomas@vondra.me> wrote:

On 6/16/25 21:09, Arseniy Mukhin wrote:

On Mon, Jun 16, 2025 at 6:58 PM Tomas Vondra <tomas@vondra.me> wrote:

Thanks.

I went through the patches, polished the commit messages and did some
minor tweaks in patch 0002 (to make the variable names a bit more
consistent, and reduce the scope a little bit). I left it as a separate
patch to make the changes clearer, but it should be merged into 0002.

Please read through the commit messages, and let me know if I got some
of the details wrong (or not clear enough). Otherwise I plan to start
pushing this soon (~tomorrow).

LGTM.
Noticed a few typos in messages:
in v8-0002-amcheck-Fix-checks-of-entry-order-for-GIN-indexes.patch
- parent key is creator
- as the core incorrectly expected
and 'Arseniy Mikhin' in some patches.

Thanks for noticing those typos, especially the one in the name.

Do today's commits clear this from the PostgreSQL 18 Open Items list?

That's the intent, yes. There's one remaining commit.

--
Tomas Vondra