amcheck verification for GiST

Started by Andrey Borodin over 7 years ago, 31 messages
#1Andrey Borodin
x4mmm@yandex-team.ru
1 attachment(s)

Hi, hackers!

Here's the patch with amcheck functionality for GiST.

It basically checks two invariants:
1. No internal tuple needs adjustment to cover the tuples of the page it references.
2. Each internal page references either only leaf pages or only internal pages.

We cannot actually check all balanced-tree invariants, for concurrency reasons: some concurrent splits will be visible as temporary balance violations.
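As a rough illustration of the two invariants, here is a toy model. It is only a sketch: keys are plain bounding boxes, and every name in it (ToyBox, ToyPage, toy_needs_adjustment, and so on) is hypothetical. The patch itself relies on gistgetadjusted(), which goes through the operator class, rather than on any box arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy key: an axis-aligned bounding box (hypothetical, not a PostgreSQL type). */
typedef struct ToyBox
{
	double		xmin, ymin, xmax, ymax;
} ToyBox;

/* Toy page: a leaf flag plus the keys stored on the page. */
typedef struct ToyPage
{
	bool		is_leaf;
	size_t		nkeys;
	const ToyBox *keys;
} ToyPage;

/* True if the parent key would have to grow to cover the child key. */
static bool
toy_needs_adjustment(const ToyBox *parent, const ToyBox *child)
{
	return child->xmin < parent->xmin || child->ymin < parent->ymin ||
		child->xmax > parent->xmax || child->ymax > parent->ymax;
}

/* Invariant 1: the downlink key covers every key on the referenced page. */
static bool
toy_downlink_covers_page(const ToyBox *parent, const ToyPage *child)
{
	assert(parent != NULL && child != NULL);
	for (size_t i = 0; i < child->nkeys; i++)
	{
		if (toy_needs_adjustment(parent, &child->keys[i]))
			return false;
	}
	return true;
}

/* Invariant 2: at least one child, and all children are of the same kind. */
static bool
toy_children_uniform(const ToyPage *const *children, size_t nchildren)
{
	if (nchildren == 0)
		return false;
	for (size_t i = 1; i < nchildren; i++)
	{
		if (children[i]->is_leaf != children[0]->is_leaf)
			return false;
	}
	return true;
}
```

In this model a violation of invariant 1 is reported as soon as one child key sticks out of the parent box, which mirrors the "any adjustment is needed" test in the patch.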

Are there any other invariants that we can check?

I'd be happy to hear any thoughts about this.

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck.patch (application/octet-stream)
From 6e50f29e7a040c8caef07d3b8b68d9333e7e1401 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amborodin@acm.org>
Date: Sun, 23 Sep 2018 14:59:54 +0500
Subject: [PATCH] GiST verification function for amcheck

Function gist_index_check() verifies that in the target GiST index no internal tuple needs extension to cover the tuples of its subtree, and that the page graph respects balanced-tree invariants.
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  14 ++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 272 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 ++
 7 files changed, 322 insertions(+), 4 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.1--1.2.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c5764b544f..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
new file mode 100644
index 0000000000..6888900303
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -0,0 +1,14 @@
+/* amcheck--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.2'" to load this file. \quit
+
+--
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index 469048403d..c6e310046d 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..d8ad66805b
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..585c455ca5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..c5287ec57c
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,272 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants:
+ * an internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2018, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Page parentpage, Page page, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(LOG,
+					(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+							RelationGetRelationName(rel)),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent records",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */
+static inline bool
+gist_check_internal_page(Relation rel, Page page, BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool has_leafs = false;
+	bool has_internals = false;
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));	
+		Buffer		buffer;
+		Page child_page;
+
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(LOG,
+					(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+							RelationGetRelationName(rel)),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+		
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		gistcheckpage(rel, buffer);
+		child_page = (Page) BufferGetPage(buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, page, child_page, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+	
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/* 
+ * Main entry point for GiST check. Allocates memory context and scans 
+ * through GiST graph.
+ */
+static inline void
+gist_check_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+	
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+#if PG_VERSION_NUM >= 110000
+												 ALLOCSET_DEFAULT_SIZES);
+#else
+												 ALLOCSET_DEFAULT_MINSIZE,
+												 ALLOCSET_DEFAULT_INITSIZE,
+												 ALLOCSET_DEFAULT_MAXSIZE);
+#endif
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		gistcheckpage(rel, buffer);
+		page = (Page) BufferGetPage(buffer);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			Assert(stack->blkno == GIST_ROOT_BLKNO);
+		}
+		else
+		{
+			/* check for split proceeded after look at parent */
+			pushStackIfSplited(page, stack);
+
+			maxoff = PageGetMaxOffsetNumber(page);
+
+			if (gist_check_internal_page(rel, page, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page, i);
+					idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+
+		UnlockReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+    MemoryContextSwitchTo(oldcontext);
+    MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!IndexIsValid(rel->rd_index))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	indrel = index_open(indrelid, ShareLock);
+
+	gist_index_checkable(indrel);
+	gist_check_keys_consistency(indrel);		
+
+	index_close(indrel, ShareLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 66a0232e24..621aeb881d 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -163,6 +163,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.1 (Apple Git-112)

#2Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Andrey Borodin (#1)
Re: amcheck verification for GiST

On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Here's the patch with amcheck functionality for GiST.

Hi Andrey,

Windows doesn't like it[1]:

contrib/amcheck/verify_gist.c(163): error C2121: '#' : invalid
character : possibly the result of a macro expansion
[C:\projects\postgresql\amcheck.vcxproj]

That's:

+    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+ "amcheck context",
+#if PG_VERSION_NUM >= 110000
+ ALLOCSET_DEFAULT_SIZES);
+#else
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
+#endif

Not sure what's going on there... perhaps it doesn't like you to do
that in the middle of a function-like-macro invocation
(AllocSetContextCreate)?
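
One plausible reading of that error, offered as a sketch rather than a diagnosis: the C standard leaves the behavior undefined when something that looks like a preprocessing directive appears inside the argument list of a function-like macro, so compilers are free to disagree about it. A portable pattern is to make the version-dependent choice before the invocation. All names below (TOY_VERSION_NUM, TOY_CONTEXT_CREATE, toy_context_create_impl, and the size constants) are hypothetical stand-ins, not PostgreSQL identifiers.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these names come from PostgreSQL. */
#define TOY_VERSION_NUM		110000
#define TOY_DEFAULT_SIZES	0, 8192, 8388608
#define TOY_MINSIZE			0
#define TOY_INITSIZE		8192
#define TOY_MAXSIZE			8388608

static long
toy_context_create_impl(const char *name, long minsize, long initsize, long maxsize)
{
	(void) name;
	return minsize + initsize + maxsize;
}

/* A function-like macro standing in for an API that is implemented as a macro. */
#define TOY_CONTEXT_CREATE(name, ...) toy_context_create_impl(name, __VA_ARGS__)

/* Make the version-dependent choice here, outside any macro invocation. */
#if TOY_VERSION_NUM >= 110000
#define TOY_SIZES TOY_DEFAULT_SIZES
#else
#define TOY_SIZES TOY_MINSIZE, TOY_INITSIZE, TOY_MAXSIZE
#endif

static long
toy_make_context(void)
{
	/* No directive appears between the macro's parentheses. */
	return TOY_CONTEXT_CREATE("amcheck context", TOY_SIZES);
}
```

Wrapping the whole statement in #if/#else instead of only its arguments would work equally well; the point is only that the directive must not sit inside the macro call.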

[1]: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.14056

--
Thomas Munro
http://www.enterprisedb.com

#3Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#2)
Re: amcheck verification for GiST

On 2018-09-24 15:29:38 +1200, Thomas Munro wrote:

On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Here's the patch with amcheck functionality for GiST.

Hi Andrey,

Windows doesn't like it[1]:

contrib/amcheck/verify_gist.c(163): error C2121: '#' : invalid
character : possibly the result of a macro expansion
[C:\projects\postgresql\amcheck.vcxproj]

That's:

+    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+ "amcheck context",
+#if PG_VERSION_NUM >= 110000
+ ALLOCSET_DEFAULT_SIZES);
+#else
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
+#endif

Not sure what's going on there... perhaps it doesn't like you to do
that in the middle of a function-like-macro invocation
(AllocSetContextCreate)?

But note that version-dependent code shouldn't be present in
/contrib anyway.

Greetings,

Andres Freund

#4Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Thomas Munro (#2)
1 attachment(s)
Re: amcheck verification for GiST

Hi!

On 24 Sep 2018, at 8:29, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Here's the patch with amcheck functionality for GiST.

Hi Andrey,

Windows doesn't like it[1]:

Thanks, Thomas! Yes, I missed that version-dependent macro. It is indeed redundant.

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v2.patch (application/octet-stream)
From 62c425583deed26e72540a820310258d17961515 Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amborodin@acm.org>
Date: Sun, 23 Sep 2018 14:59:54 +0500
Subject: [PATCH] GiST verification function for amcheck v2

Function gist_index_check() verifies that in the target GiST index no internal tuple needs extension to cover the tuples of its subtree, and that the page graph respects balanced-tree invariants.
---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  14 ++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 266 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 ++
 7 files changed, 316 insertions(+), 4 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.1--1.2.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c5764b544f..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
new file mode 100644
index 0000000000..6888900303
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -0,0 +1,14 @@
+/* amcheck--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.2'" to load this file. \quit
+
+--
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index 469048403d..c6e310046d 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..d8ad66805b
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..585c455ca5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..8fcf7b9c0b
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,266 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants:
+ * an internal page must have at least one downlink, and an internal page
+ * can reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2018, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Page parentpage, Page page, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(LOG,
+					(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+							RelationGetRelationName(rel)),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent records",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */
+static inline bool
+gist_check_internal_page(Relation rel, Page page, BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool has_leafs = false;
+	bool has_internals = false;
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));	
+		Buffer		buffer;
+		Page child_page;
+
+		if (GistTupleIsInvalid(idxtuple))
+			ereport(LOG,
+					(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+							RelationGetRelationName(rel)),
+					 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+					 errhint("Please REINDEX it.")));
+		
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		gistcheckpage(rel, buffer);
+		child_page = (Page) BufferGetPage(buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, page, child_page, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+	
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/* 
+ * Main entry point for GiST check. Allocates memory context and scans 
+ * through GiST graph.
+ */
+static inline void
+gist_check_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+	
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		gistcheckpage(rel, buffer);
+		page = (Page) BufferGetPage(buffer);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			Assert(stack->blkno == GIST_ROOT_BLKNO);
+		}
+		else
+		{
+			/* check for split proceeded after look at parent */
+			pushStackIfSplited(page, stack);
+
+			maxoff = PageGetMaxOffsetNumber(page);
+
+			if (gist_check_internal_page(rel, page, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page, i);
+					idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+
+		UnlockReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+    MemoryContextSwitchTo(oldcontext);
+    MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!IndexIsValid(rel->rd_index))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	indrel = index_open(indrelid, ShareLock);
+
+	gist_index_checkable(indrel);
+	gist_check_keys_consistency(indrel);		
+
+	index_close(indrel, ShareLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 66a0232e24..621aeb881d 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -163,6 +163,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.1 (Apple Git-112)

#5Peter Geoghegan
pg@bowt.ie
In reply to: Andrey Borodin (#4)
Re: amcheck verification for GiST

On Sun, Sep 23, 2018 at 10:12 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

(0001-GiST-verification-function-for-amcheck-v2.patch)

Thanks for working on this. Some feedback:

* You do this:

+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

* You need to sprinkle a few CHECK_FOR_INTERRUPTS() calls around.
Certainly, there should be one at the top of the main loop.

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

* Why is it okay to do this?:

+       if (GistTupleIsInvalid(idxtuple))
+           ereport(LOG,
+                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
+                           RelationGetRelationName(rel)),
+                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+                    errhint("Please REINDEX it.")));

You should probably mention the gistdoinsert() precedent for this.

* Can we check GIST_PAGE_ID somewhere? I try to be as paranoid as
possible, adding almost any check that I can think of, provided it
hasn't got very high overhead. Note that gistcheckpage() doesn't do
this for you.

* Should we be concerned about the memory used by pushStackIfSplited()?

* How about a cross-check between IndexTupleSize() and
ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
we have this redundancy, which wastes space, but we do, so we might as
well get some small benefit from it.

--
Peter Geoghegan

#6Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Peter Geoghegan (#5)
1 attachment(s)
Re: amcheck verification for GiST

Hi, Peter!

Thank you for the review!

On 7 Dec 2018, at 3:59, Peter Geoghegan <pg@bowt.ie> wrote:

On Sun, Sep 23, 2018 at 10:12 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
* You do this:

+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

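If we unlock the parent page before checking the child, a concurrent insert can adjust the tuple on the parent, descend into the child, and insert a new tuple there.
The check would then compare the new child tuple against the old parent key, which can trigger a false positive. I'll think about it more.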
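
The race described above can be illustrated with a small single-threaded simulation. This is only a sketch under simplifying assumptions: keys are one-dimensional intervals rather than real GiST keys, the "concurrent" insert is simply called between the two steps of the checker, and the names Interval, contains() and insert_key() are hypothetical and do not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 1-D key: a closed interval [lo, hi]. */
typedef struct Interval
{
	int			lo;
	int			hi;
} Interval;

/* True if the parent key covers the child key, i.e. no adjustment is needed. */
static bool
contains(Interval parent, Interval child)
{
	return parent.lo <= child.lo && child.hi <= parent.hi;
}

/* Model of an insert: widen the parent key first, then store the child tuple. */
static void
insert_key(Interval *parent, Interval *child_slot, Interval newkey)
{
	assert(newkey.lo <= newkey.hi);
	if (newkey.lo < parent->lo)
		parent->lo = newkey.lo;
	if (newkey.hi > parent->hi)
		parent->hi = newkey.hi;
	*child_slot = newkey;
}

/*
 * Checker that copies the parent key, gives up the parent, and only then
 * looks at the child. The insert runs in the gap between the two steps.
 * Returns true if the checker would (wrongly) report corruption.
 */
static bool
stale_check_reports_corruption(void)
{
	Interval	parent = {0, 10};
	Interval	child = {2, 5};
	Interval	snapshot = parent;	/* parent is read, then released */
	Interval	newkey = {12, 15};

	insert_key(&parent, &child, newkey);	/* insert sneaks in here */

	return !contains(snapshot, child);
}

/*
 * With parent and child held together, the checker sees the state either
 * before or after the insert, never a mix of the two.
 */
static bool
coupled_check_reports_corruption(void)
{
	Interval	parent = {0, 10};
	Interval	child = {2, 5};
	Interval	newkey = {12, 15};
	bool		before = !contains(parent, child);
	bool		after;

	insert_key(&parent, &child, newkey);
	after = !contains(parent, child);

	return before || after;
}
```

In this model the stale comparison flags a perfectly healthy tree, while the coupled comparison does not; that is the reason given for holding both pages at once.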

* You need to sprinkle a few CHECK_FOR_INTERRUPTS() calls around.
Certainly, there should be one at the top of the main loop.

I've added a check at the top of the main scan loop. All deeper code paths hold buffer locks.

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

I've changed the lock level to AccessShareLock. IMV the scan is just as safe as a regular GiST index scan.
There is also my VACUUM patch in the CF, which can add deleted pages. If one of these two patches is committed, I'll update the other accordingly.

* Why is it okay to do this?:

+       if (GistTupleIsInvalid(idxtuple))
+           ereport(LOG,
+                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
+                           RelationGetRelationName(rel)),
+                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+                    errhint("Please REINDEX it.")));

You should probably mention the gistdoinsert() precedent for this.

This invalid tuple will break inserts but will not affect selects. I'm not sure whether this should be an error or a warning in amcheck.

* Can we check GIST_PAGE_ID somewhere? I try to be as paranoid as
possible, adding almost any check that I can think of, provided it
hasn't got very high overhead. Note that gistcheckpage() doesn't do
this for you.

Done. I think gistcheckpage() should do this too, but we should avoid changing GiST internals in this patch.

* Should we be concerned about the memory used by pushStackIfSplited()?

Memory is pfree()'d as usual. The tree scan code is almost equivalent to VACUUM's gistbulkdelete().
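
To make the memory question concrete, here is a sketch of the explicit-stack scan pattern on a toy tree. It is an illustration under assumptions, not the patch code: pages are plain array entries, malloc()/free() stand in for palloc()/pfree(), every child is pushed (the patch only pushes children of pages whose children are internal), and ToyPage, ScanItem, ScanStats and scan_tree() are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CHILDREN 4

/* Toy page: an internal page lists its children, a leaf lists none. */
typedef struct ToyPage
{
	int			nchildren;
	int			children[MAX_CHILDREN];
} ToyPage;

/* One pending page on the scan stack. */
typedef struct ScanItem
{
	int			blkno;
	struct ScanItem *next;
} ScanItem;

typedef struct ScanStats
{
	int			visited;		/* pages taken off the stack */
	int			peak;			/* largest number of live stack items */
} ScanStats;

static ScanItem *
new_item(int blkno, ScanItem *next)
{
	ScanItem   *item = malloc(sizeof(ScanItem));

	if (item == NULL)
		abort();
	item->blkno = blkno;
	item->next = next;
	return item;
}

static ScanStats
scan_tree(const ToyPage *pages, int root)
{
	ScanStats	stats = {0, 1};
	int			live = 1;
	ScanItem   *stack = new_item(root, NULL);

	while (stack != NULL)
	{
		const ToyPage *page = &pages[stack->blkno];
		ScanItem   *next;
		int			i;

		assert(page->nchildren <= MAX_CHILDREN);
		stats.visited++;

		/* Push every child right behind the current head. */
		for (i = 0; i < page->nchildren; i++)
		{
			stack->next = new_item(page->children[i], stack->next);
			live++;
		}
		if (live > stats.peak)
			stats.peak = live;

		/* Pop the head and free it before moving on. */
		next = stack->next;
		free(stack);
		live--;
		stack = next;
	}
	return stats;
}

/* Three-level sample tree: root 0, internal pages 1 and 2, leaves 3..6. */
static ScanStats
scan_sample_tree(void)
{
	static const ToyPage pages[7] = {
		{2, {1, 2}},
		{2, {3, 4}},
		{2, {5, 6}},
		{0, {0}}, {0, {0}}, {0, {0}}, {0, {0}}
	};

	return scan_tree(pages, 0);
}
```

Each item is freed as soon as its page has been processed, so the number of live items is bounded by the pending downlinks along the current path rather than by the size of the index.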

* How about a cross-check between IndexTupleSize() and
ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
we have this redundancy, which wastes space, but we do, so we might as
well get some small benefit from it.

Done. I compare the two sizes after MAXALIGN() rounding; this rounding seems correct.
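
As a standalone illustration of the rounded comparison, the sketch below rounds both lengths up to an alignment boundary before comparing them. It assumes an 8-byte maximum alignment, and SKETCH_MAXALIGN and tuple_sizes_agree() are hypothetical stand-ins rather than PostgreSQL's own macros or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 8-byte maximum alignment, as on common 64-bit platforms. */
#define SKETCH_ALIGNOF 8
#define SKETCH_MAXALIGN(len) \
	(((size_t) (len) + (SKETCH_ALIGNOF - 1)) & ~((size_t) (SKETCH_ALIGNOF - 1)))

/*
 * True if the line pointer length and the size stored in the tuple header
 * agree once both are rounded up to the alignment boundary.
 */
static bool
tuple_sizes_agree(size_t lp_len, size_t tuple_size)
{
	/* The bit trick above only works for power-of-two alignments. */
	assert((SKETCH_ALIGNOF & (SKETCH_ALIGNOF - 1)) == 0);

	return SKETCH_MAXALIGN(lp_len) == SKETCH_MAXALIGN(tuple_size);
}
```

Note that the rounded comparison is more lenient than an exact one: lengths 13 and 16 are treated as equal because both round to 16, while 8 and 9 are not because they fall on different sides of a boundary.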

Please find attached v3.

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v3.patch (application/octet-stream)
From 90dbe2717add4c5ec06e42d7249ed912b4283831 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Tue, 1 Jan 2019 15:03:13 +0500
Subject: [PATCH] GiST verification function for amcheck v3

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  14 ++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 293 ++++++++++++++++++++++++
 doc/src/sgml/amcheck.sgml               |  19 ++
 7 files changed, 343 insertions(+), 4 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.1--1.2.sql
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c5764b544f..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
new file mode 100644
index 0000000000..6888900303
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -0,0 +1,14 @@
+/* amcheck--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.2'" to load this file. \quit
+
+--
+-- gist_index_check()
+--
+CREATE FUNCTION gist_index_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index 469048403d..c6e310046d 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..d8ad66805b
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
+ gist_index_check 
+------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..585c455ca5
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..1af884a14b
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,293 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2018, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid)
+{
+	/*
+	 * Check that it's not a leftover invalid tuple from pre-9.1.
+	 * See also gistdoinsert() and gistbulkdelete() handling of such tuples.
+	 * Such a tuple is reported as an error, matching the ereport(ERROR) below.
+	 */
+	if (GistTupleIsInvalid(idxtuple))
+		ereport(ERROR,
+				(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+						RelationGetRelationName(rel)),
+				 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+				 errhint("Please REINDEX it.")));
+
+	if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has inconsistent tuple sizes",
+						RelationGetRelationName(rel))));
+}
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer)
+{
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted pages",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Page parentpage, Page page, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent records",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */
+/* Return true if further descent is necessary */
+static inline bool
+gist_check_internal_page(Relation rel, Page page, BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool has_leafs = false;
+	bool has_internals = false;
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+		Buffer		buffer;
+		Page child_page;
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		child_page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, child_page, buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, page, child_page, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans 
+ * through GiST graph.
+ */
+static inline void
+gist_check_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+	MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, page, buffer);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			Assert(stack->blkno == GIST_ROOT_BLKNO);
+		}
+		else
+		{
+			/* check for a split that happened after we looked at the parent */
+			pushStackIfSplited(page, stack);
+
+			maxoff = PageGetMaxOffsetNumber(page);
+
+			if (gist_check_internal_page(rel, page, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page, i);
+					idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+
+		UnlockReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_check);
+
+Datum
+gist_index_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	indrel = index_open(indrelid, AccessShareLock);
+
+	gist_index_checkable(indrel);
+	gist_check_keys_consistency(indrel);
+
+	index_close(indrel, AccessShareLock);
+
+	PG_RETURN_VOID();
+}
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 8bb60d5c2d..4fda774713 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -162,6 +162,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_check</function> tests that its target GiST
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.2 (Apple Git-113)

#7Peter Geoghegan
pg@bowt.ie
In reply to: Andrey Borodin (#6)
Re: amcheck verification for GiST

On Tue, Jan 1, 2019 at 2:11 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

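If we unlock the parent page before checking the child, a concurrent insert can adjust the tuple on the parent, descend into the child, and insert a new tuple there.
The check would then compare the new child tuple against the old parent key, which can trigger a false positive. I'll think about it more.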

I think that holding a buffer lock on an internal page for as long as
it takes to check all of the child pages is a non-starter. If you
can't think of a way of not doing that that's race free with a
relation-level AccessShareLock, then a relation-level ShareLock (which
will block VACUUM) seems necessary.

I think that you should duplicate some of what's in
bt_index_check_internal() -- lock the heap table as well as the index,
to avoid deadlocks. We might not do this with contrib/pageinspect, but
that's not really intended to be used in production in the same way
this will be.

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

I've changed the lock level to AccessShareLock. IMV the scan is just as safe as a regular GiST index scan.
There is also my VACUUM patch in the CF, which can add deleted pages. If one of these two patches is committed, I'll update the other accordingly.

Maybe you should have renamed it to gist_index_parent_check() instead,
and kept the ShareLock. I don't think that this business with buffer
locks is acceptable. And it is mostly checking parent/child
relationships. Right?

You're not really able to check GiST tuples against anything other
than their parent, unlike with B-Tree (you can only do very simple
things, like the IndexTupleSize()/lp_len cross-check). Naming the
function gist_index_parent_check() seems like the right way to go,
even if you could get away with an AccessShareLock (which I now tend
to doubt). It was way too optimistic to suppose that there might be a
clever way of locking down race conditions that allowed you to not
couple buffer locks, but also use an AccessShareLock. After all, even
the B-Tree amcheck code doesn't manage to do this when verifying
parent/child relationships (it only does something clever with the
sibling page cross-check).

* Why is it okay to do this?:

+       if (GistTupleIsInvalid(idxtuple))
+           ereport(LOG,
+                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
+                           RelationGetRelationName(rel)),
+                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+                    errhint("Please REINDEX it.")));

You should probably mention the gistdoinsert() precedent for this.

This invalid tuple will break inserts but will not affect selects. I'm not sure whether this should be an error or a warning in amcheck.

It should be an error. Breaking inserts is a serious problem.

--
Peter Geoghegan

#8Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#7)
Re: amcheck verification for GiST

On Thu, Jan 31, 2019 at 03:58:48PM -0800, Peter Geoghegan wrote:

I think that holding a buffer lock on an internal page for as long as
it takes to check all of the child pages is a non-starter. If you
can't think of a way of not doing that that's race free with a
relation-level AccessShareLock, then a relation-level ShareLock (which
will block VACUUM) seems necessary.

(Please be careful to update the status of the patch in the CF
correctly!)
This review is recent, so I have moved the patch to next CF, waiting
for input from the author.
--
Michael

#9Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Peter Geoghegan (#7)
1 attachment(s)
Re: amcheck verification for GiST

Hi Peter!

Sorry for the delay. Here's new version.

On 1 Feb 2019, at 4:58, Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Jan 1, 2019 at 2:11 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

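If we unlock the parent page before checking the child, a concurrent insert can adjust the tuple on the parent, descend into the child, and insert a new tuple there.
The check would then compare the new child tuple against the old parent key, which can trigger a false positive. I'll think about it more.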

I think that holding a buffer lock on an internal page for as long as
it takes to check all of the child pages is a non-starter. If you
can't think of a way of not doing that that's race free with a
relation-level AccessShareLock, then a relation-level ShareLock (which
will block VACUUM) seems necessary.

I think that you should duplicate some of what's in
bt_index_check_internal() -- lock the heap table as well as the index,
to avoid deadlocks. We might not do this with contrib/pageinspect, but
that's not really intended to be used in production in the same way
this will be.

I've extracted the functions amcheck_lock_relation() and amcheck_unlock_relation().
This makes the patch a little more complicated, but I do not think code duplication would be OK here.
Instead of the lock/unlock functions, I could provide a single-function interface that takes index_checkable() and index_check() callbacks.
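
A rough sketch of what such a callback-based interface might look like is given below. Everything in it is a stand-in: SketchRelation, SketchLockMode and sketch_check_index() are hypothetical, locks are modelled as a field on the relation, and taking the heap lock before the index lock is an assumption based on the deadlock remark earlier in the thread. The lock levels follow the thread: ShareLock for a parent check, AccessShareLock otherwise.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for lock modes and relations. */
typedef enum SketchLockMode
{
	SKETCH_NO_LOCK,
	SKETCH_ACCESS_SHARE_LOCK,
	SKETCH_SHARE_LOCK
} SketchLockMode;

typedef struct SketchRelation
{
	const char *name;
	SketchLockMode held;
} SketchRelation;

typedef void (*SketchCheckableCallback) (SketchRelation *index);
typedef void (*SketchCheckCallback) (SketchRelation *index,
									 SketchRelation *heap);

/* Single entry point: lock, validate, verify, unlock. */
static void
sketch_check_index(SketchRelation *heap, SketchRelation *index,
				   bool parentcheck,
				   SketchCheckableCallback checkable,
				   SketchCheckCallback check)
{
	SketchLockMode mode = parentcheck ? SKETCH_SHARE_LOCK
									  : SKETCH_ACCESS_SHARE_LOCK;

	/* Assumed ordering: heap first, then index. */
	heap->held = mode;
	index->held = mode;

	checkable(index);			/* access-method specific eligibility test */
	check(index, heap);			/* access-method specific verification */

	/* Release in reverse order. */
	index->held = SKETCH_NO_LOCK;
	heap->held = SKETCH_NO_LOCK;
}

/* Demo callbacks that record the order in which they were called. */
static int	sketch_calls = 0;

static void
demo_checkable(SketchRelation *index)
{
	assert(index->held != SKETCH_NO_LOCK);
	sketch_calls = sketch_calls * 10 + 1;
}

static void
demo_check(SketchRelation *index, SketchRelation *heap)
{
	assert(index->held == heap->held);
	sketch_calls = sketch_calls * 10 + 2;
}

/* Runs the demo and returns the recorded call order as a number. */
static int
sketch_demo(bool parentcheck)
{
	SketchRelation heap = {"gist_check", SKETCH_NO_LOCK};
	SketchRelation index = {"gist_check_idx", SKETCH_NO_LOCK};

	sketch_calls = 0;
	sketch_check_index(&heap, &index, parentcheck,
					   demo_checkable, demo_check);
	assert(heap.held == SKETCH_NO_LOCK && index.held == SKETCH_NO_LOCK);
	return sketch_calls;
}
```

The trade-off against the lock/unlock pair is that the caller can no longer forget to unlock, at the price of forcing every access method's check into the same two-callback shape.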

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

I've changed the lock level to AccessShareLock. IMV the scan is just as safe as a regular GiST index scan.
There is also my VACUUM patch in the CF, which can add deleted pages. If one of these two patches is committed, I'll update the other accordingly.

Maybe you should have renamed it to gist_index_parent_check() instead,
and kept the ShareLock. I don't think that this business with buffer
locks is acceptable. And it is mostly checking parent/child
relationships. Right?

That's right. Semantically, gist_index_parent_check() is the correct name; let's use it.

You're not really able to check GiST tuples against anything other
than their parent, unlike with B-Tree (you can only do very simple
things, like the IndexTupleSize()/lp_len cross-check). Naming the
function gist_index_parent_check() seems like the right way to go,
even if you could get away with an AccessShareLock (which I now tend
to doubt). It was way too optimistic to suppose that there might be a
clever way of locking down race conditions that allowed you to not
couple buffer locks, but also use an AccessShareLock. After all, even
the B-Tree amcheck code doesn't manage to do this when verifying
parent/child relationships (it only does something clever with the
sibling page cross-check).

That's true: we cannot avoid locking the parent and child pages simultaneously to check the correctness of tuples.

Currently, we do not check index tuples against the heap. Should we do this, or leave it for another patch?

Thanks!

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v4.patch (application/octet-stream)
From cc7c3a57605f95f4f1c7060a4ab7dc957da00c3b Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Tue, 1 Jan 2019 15:03:13 +0500
Subject: [PATCH] GiST verification function for amcheck v4

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  14 ++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  31 +++
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 290 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  72 +++---
 doc/src/sgml/amcheck.sgml               |  21 ++
 9 files changed, 413 insertions(+), 36 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.1--1.2.sql
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c5764b544f..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
new file mode 100644
index 0000000000..5571642f4a
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -0,0 +1,14 @@
+/* amcheck--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.2'" to load this file. \quit
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index 469048403d..c6e310046d 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..84da7d7ec0
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE *lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE lockmode);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f373aea62
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..95e62ba975
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..1cdff357a1
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,290 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "amcheck.h"
+
+#include "access/gist_private.h"
+
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid)
+{
+	/*
+	 * Check that it's not a leftover invalid tuple from pre-9.1.
+	 * See also gistdoinsert() and gistbulkdelete() handling of such tuples.
+	 * Such a tuple is reported as an error, matching the ereport(ERROR) below.
+	 */
+	if (GistTupleIsInvalid(idxtuple))
+		ereport(ERROR,
+				(errmsg("index \"%s\" contains an inner tuple marked as invalid",
+						RelationGetRelationName(rel)),
+				 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+				 errhint("Please REINDEX it.")));
+
+	if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has inconsistent tuple sizes",
+						RelationGetRelationName(rel))));
+}
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer)
+{
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted pages",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Page parentpage, Page page, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has inconsistent records",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/* Check of an internal page. Hold locks on two pages at a time (parent+child). */
+/* Return true if further descent is necessary */
+static inline bool
+gist_check_internal_page(Relation rel, Page page, BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool has_leafs = false;
+	bool has_internals = false;
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+		Buffer		buffer;
+		Page child_page;
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		child_page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, child_page, buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, page, child_page, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans 
+ * through GiST graph.
+ */
+static inline void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+	MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, page, buffer);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			Assert(stack->blkno == GIST_ROOT_BLKNO);
+		}
+		else
+		{
+			/* check whether the page was split after we looked at its parent */
+			pushStackIfSplited(page, stack);
+
+			maxoff = PageGetMaxOffsetNumber(page);
+
+			if (gist_check_internal_page(rel, page, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page, i);
+					idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+		}
+
+		UnlockReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index at the necessary level */
+	amcheck_lock_relation(indrelid, true, &indrel, &heaprel, &lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 964200a767..f17552c5e5 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,22 +21,12 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
 #include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/nbtree.h"
-#include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -195,21 +185,18 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
 
 	if (parentcheck)
-		lockmode = ShareLock;
+		*lockmode = ShareLock;
 	else
-		lockmode = AccessShareLock;
+		*lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -221,9 +208,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = table_open(heapid, *lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -237,25 +224,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, *lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	bt_check_every_level(indrel, heaprel, parentcheck, heapallindexed);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Counterpart of amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -266,6 +251,29 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index at the necessary level */
+	amcheck_lock_relation(indrelid, parentcheck, &indrel, &heaprel, &lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, parentcheck, heapallindexed);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 8bb60d5c2d..c147be1c11 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -162,6 +162,27 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      has consistent parent-child tuple relations (no parent tuples
+      require tuple adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference only leaf pages or only internal
+      pages). As with <function>bt_index_parent_check</function>,
+      <function>gist_index_parent_check</function> acquires
+      <literal>ShareLock</literal> on index and heap relations.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.20.1

In reply to: Andrey Borodin (#9)
Re: amcheck verification for GiST

On Sun, Feb 17, 2019 at 12:55 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

That's true, we cannot avoid locking parent and child page simultaneously to check correctness of tuples.

Right.

Some further questions/comments:

* I think that this should be an error:

+       if (GistPageIsLeaf(page))
+       {
+           /* should never happen unless it is root */
+           Assert(stack->blkno == GIST_ROOT_BLKNO);
+       }

I use assertions in the verify_nbtree.c, but only for things that are
programming errors, and state that amcheck "owns". If they fail, then
it's a bug in amcheck specifically (they're really obvious assertions
about local state). Whereas this case could in theory happen with a
corrupt index, for example due to a page written in the wrong place.
I'm sure that the root block looks very similar to a leaf, so we won't
detect this any other way.

It's good to be paranoid, and to even think adversarially, provided it
doesn't make the design more difficult and has low runtime overhead.
We could debate whether or not this corruption is realistic, but it's
easy to make it an error and the cost is low, so you should just do
it.
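The distinction can be sketched in plain C, outside PostgreSQL. This is only an illustration under stated assumptions: ToyPage, ToyCheckResult and check_unexpected_leaf() are hypothetical names that do not exist in amcheck, and fprintf() stands in for ereport(). The assertion guards state that the checker itself owns; the on-disk condition is reported as corruption instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; these are not PostgreSQL types. */
#define TOY_ROOT_BLKNO 0u

typedef struct ToyPage
{
    unsigned blkno;     /* block number the page was read from */
    bool     is_leaf;   /* flag taken from the (possibly corrupt) page */
} ToyPage;

typedef enum ToyCheckResult
{
    TOY_CHECK_OK,
    TOY_CHECK_CORRUPTED
} ToyCheckResult;

/*
 * Called for a page that the traversal reached through a downlink of an
 * internal page, or for the root.
 */
static ToyCheckResult
check_unexpected_leaf(const ToyPage *page)
{
    /* Local state owned by the checker: failing here is a checker bug. */
    assert(page != NULL);

    /* On-disk state: a violation is reported, never asserted. */
    if (page->is_leaf && page->blkno != TOY_ROOT_BLKNO)
    {
        fprintf(stderr, "block %u: leaf page reached where an internal page was expected\n",
                page->blkno);
        return TOY_CHECK_CORRUPTED;
    }
    return TOY_CHECK_OK;
}
```

With this split, a build without assertions still detects the misplaced page, which is the point of making it an error.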

* Couldn't leaf-like root pages also get some of the testing that we
do for other pages within check_index_tuple()? Ideally it wouldn't be
too much of a special case, though.

* I think that you need to add an
errcode(ERRCODE_FEATURE_NOT_SUPPORTED) to this:

+   /*
+    * Check that it's not a leftover invalid tuple from pre-9.1
+    * See also gistdoinsert() and gistbulkdelete() handling of such tuples.
+    * We do not consider it error here, but warn operator.
+    */
+   if (GistTupleIsInvalid(idxtuple))
+       ereport(ERROR,
+               (errmsg("index \"%s\" contains an inner tuple marked as invalid",
+                       RelationGetRelationName(rel)),
+                errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+                errhint("Please REINDEX it.")));

Currently, we do not check index tuples against heap. Should we do this or leave this for another patch?

To address this question: I would leave this for now. It can probably
work based on the same principle as nbtree's heapallindexed
verification, with few or no changes, but I don't think that we need
to make that happen in this release.

* You still hold multiple buffer locks at once, starting with the
parent and moving to the child. Only 2 buffer locks. Why is this
necessary, given that you now hold a ShareLock on both heap and index
relations? Couldn't you just copy the parent into local memory, in the
style of verify_nbtree.c?

* Note that this "parent then child" lock order seems to not be
consistent with the general rule for holding concurrent buffer locks
that is described in the GiST README:

"""
Concurrency control
-------------------
As a rule of thumb, if you need to hold a lock on multiple pages at the
same time, the locks should be acquired in the following order: child page
before parent, and left-to-right at the same level. Always acquiring the
locks in the same order avoids deadlocks.
"""

* The main point of entry for GiST verification is
gist_check_parent_keys_consistency(). It would be nice to have
comments that describe what it does, and in what order. I understand
that English is not your first language, which makes it harder, but I
would appreciate it if you made the effort to explain the theory. I
don't want to assume that I understand your intent -- I could be
wrong.

* Suggest that you use function prototypes consistently, even for
static functions. That is the style that we prefer. Also, please make
the comments consistent with project guidelines on indentation and
style.

* It would be nice to know if the code in
gist_check_parent_keys_consistency() finds problems in the parent, or
in the child. The nbtree check has an idea of a "target" page: every
page gets to be the target exactly once, and we only ever find
problems in the current target. Maybe that is arbitrary in some cases,
because the relationship between the two is where the problem actually
is. I still think that it is a good idea to make it work that way if
possible. It makes it easier to describe complicated relationships in
comments.

* Why does it make sense to use range_gist_picksplit()/operator class
"union" support function to verify parent/child relationships? Does it
handle NULL values correctly, given the special rules?

* Why not use the "consistent" support function instead of the "union"
support function? Could you write new code that was based on
gistindex_keytest() to do this?

* The GiST docs ("63.3. Extensibility") say of the "distance" support
function: "For an internal tree node, the distance returned must not
be greater than the distance to any of the child nodes". Is there an
opportunity to test this relationship, too, by making sure that
distance is sane? Perhaps this is a very naive question -- I am not a
GiST expert.

* Is it right that gist_check_internal_page() should both return a
value that says "this internal page has child pages that are
themselves internal", while testing the child pages?

* Do we need to worry about F_DELETED pages? Why or why not?

* Do we need to worry about F_HAS_GARBAGE pages? Why or why not?

--
Peter Geoghegan

#11Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Peter Geoghegan (#10)
1 attachment(s)
Re: amcheck verification for GiST

Hi!

Thanks for this detailed review!

* Note that this "parent then child" lock order seems to not be
consistent with the general rule for holding concurrent buffer locks
that is described in the GiST README:

This is correct. I've changed the locking order.
When we check a target internal page, we make a palloc'ed copy and unlock the page (but keep the pin).
If we find a discrepancy between parent and child keys, we lock the parent page again,
look for the correct downlink, and check the keys again.
If the discrepancy is still present, we report an error.
Otherwise we drop the parent lock.
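A minimal single-threaded sketch of this copy-then-recheck protocol, assuming one-dimensional interval keys. ToyParent, Interval, covers() and child_key_ok() are hypothetical names for illustration only; the real code works on buffers and index tuples, and the "live" page here stands for the parent after it has been locked again.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_DOWNLINKS 8

/* Hypothetical one-dimensional key: a closed interval [lo, hi]. */
typedef struct Interval
{
    int lo;
    int hi;
} Interval;

/* Hypothetical parent page: parallel arrays of downlinks and their keys. */
typedef struct ToyParent
{
    int      ndownlinks;
    unsigned child_blkno[TOY_MAX_DOWNLINKS];
    Interval key[TOY_MAX_DOWNLINKS];
} ToyParent;

static bool
covers(Interval parent, Interval child)
{
    return parent.lo <= child.lo && child.hi <= parent.hi;
}

/*
 * snapshot: copy of the parent taken while it was locked.
 * live:     the parent as it looks now (the real code re-locks it here).
 * Returns false only if the discrepancy survives the recheck.
 */
static bool
child_key_ok(const ToyParent *snapshot, const ToyParent *live,
             int slot, Interval child_key)
{
    unsigned blkno;
    int      i;

    assert(slot >= 0 && slot < snapshot->ndownlinks);

    /* Fast path: the copied parent key already covers the child key. */
    if (covers(snapshot->key[slot], child_key))
        return true;

    /* Slow path: look up the downlink again on the live parent. */
    blkno = snapshot->child_blkno[slot];
    for (i = 0; i < live->ndownlinks; i++)
    {
        if (live->child_blkno[i] == blkno)
            return covers(live->key[i], child_key);
    }

    /* Downlink no longer on this page (concurrent split): not an error. */
    return true;
}
```

The design choice is that the parent is locked only on the rare slow path, and only after the child, which matches the "child before parent" rule quoted from the README.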

* I think that this should be an error:

+       if (GistPageIsLeaf(page))
+       {
+           /* should never happen unless it is root */
+           Assert(stack->blkno == GIST_ROOT_BLKNO);
+       }

Done. If this happens, it looks like a programming error, or the page header flags were corrupted
concurrently. Let's just report it.

* Couldn't leaf-like root pages also get some of the testing that we
do for other pages within check_index_tuple()? Ideally it wouldn't be
too much of a special case, though.

Done.

* I think that you need to add an
errcode(ERRCODE_FEATURE_NOT_SUPPORTED) to this:

Done.

* You still hold multiple buffer locks at once, starting with the
parent and moving to the child. Only 2 buffer locks. Why is this
necessary, given that you now hold a ShareLock on both heap and index
relations? Couldn't you just copy the parent into local memory, in the
style of verify_nbtree.c?

Done.

* The main point of entry for GiST verification is
gist_check_parent_keys_consistency(). It would be nice to have
comments that describe what it does, and in what order. I understand
that English is not your first language, which makes it harder, but I
would appreciate it if you made the effort to explain the theory. I
don't want to assume that I understand your intent -- I could be
wrong.

I've added about 10 lines of comments.

* Suggest that you use function prototypes consistently, even for
static functions. That is the style that we prefer. Also, please make
the comments consistent with project guidelines on indentation and
style.

Done.

* It would be nice to know if the code in
gist_check_parent_keys_consistency() finds problems in the parent, or
in the child. The nbtree check has an idea of a "target" page: every
page gets to be the target exactly once, and we only ever find
problems in the current target. Maybe that is arbitrary in some cases,
because the relationship between the two is where the problem actually
is. I still think that it is a good idea to make it work that way if
possible. It makes it easier to describe complicated relationships in
comments.

The "target" for gist_check_parent_keys_consistency() is the tuples on an internal
page and their relation to the tuples on the referenced page. All other checks
are just additional checks. I expect that this parent-child relationship
may contain some kind of bug: we had observed that GiST could not find some
tuples after CREATE INDEX CONCURRENTLY, but could not reliably reproduce the
problem before it was gone. That's why I started this work.

* Why does it make sense to use range_gist_picksplit()/operator class
"union" support function to verify parent/child relationships? Does it
handle NULL values correctly, given the special rules?

Yes, union handles NULLs. I do not use range_gist_picksplit().
By definition, a parent tuple is the union of all its child tuples.
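To illustrate the idea with a concrete key type, here is a small sketch that assumes bounding boxes as keys. Box, box_union() and needs_adjustment() are hypothetical helpers, not the opclass support functions; needs_adjustment() only mimics the role gistgetadjusted() plays in the patch, where a non-NULL result means the parent would have to grow to cover the child.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical key type: an axis-aligned bounding box. */
typedef struct Box
{
    double xlo;
    double ylo;
    double xhi;
    double yhi;
} Box;

static double
min_d(double a, double b)
{
    return a < b ? a : b;
}

static double
max_d(double a, double b)
{
    return a > b ? a : b;
}

/* Smallest box that covers both inputs: the "union" of two keys. */
static Box
box_union(Box a, Box b)
{
    Box u;

    assert(a.xlo <= a.xhi && a.ylo <= a.yhi);
    assert(b.xlo <= b.xhi && b.ylo <= b.yhi);

    u.xlo = min_d(a.xlo, b.xlo);
    u.ylo = min_d(a.ylo, b.ylo);
    u.xhi = max_d(a.xhi, b.xhi);
    u.yhi = max_d(a.yhi, b.yhi);
    return u;
}

static bool
box_equal(Box a, Box b)
{
    return a.xlo == b.xlo && a.ylo == b.ylo &&
           a.xhi == b.xhi && a.yhi == b.yhi;
}

/* True if the parent key would have to grow to cover the child key. */
static bool
needs_adjustment(Box parent, Box child)
{
    return !box_equal(box_union(parent, child), parent);
}
```

A tree passes the check when needs_adjustment() is false for every parent/child pair, that is, when taking the union with any child leaves the parent key unchanged.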

* Why not use the "consistent" support function instead of the "union"
support function? Could you write new code that was based on
gistindex_keytest() to do this?

Initially, I used the "consistent" function. But it answers the question
"Does the downlinked key space overlap with the query?", and the query may be "not a
given key". The consistent function is controlled by the search strategy, and different
operator classes support different sets of strategies. But every opclass
must support union to build a GiST. So I've used "union" instead of
"consistent".
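A toy example of why an overlap-style test is weaker than a union-style test, assuming integer ranges as keys. Range, range_overlaps() and range_contains() are hypothetical; a real "consistent" function dispatches on a strategy number, and only an "overlaps" strategy is modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical key type: a closed integer range [lo, hi]. */
typedef struct Range
{
    int lo;
    int hi;
} Range;

/* Consistent-style question for an "overlaps" strategy only. */
static bool
range_overlaps(Range key, Range query)
{
    assert(key.lo <= key.hi && query.lo <= query.hi);
    return key.lo <= query.hi && query.lo <= key.hi;
}

/* Union-style question: does the parent key already cover the child? */
static bool
range_contains(Range parent, Range child)
{
    assert(parent.lo <= parent.hi && child.lo <= child.hi);
    return parent.lo <= child.lo && child.hi <= parent.hi;
}
```

For a parent [0, 10] and a child [8, 15], the overlap test succeeds although the parent does not cover the child, so a scan for values in 11..15 would never descend into that child. The containment test catches this case.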

* The GiST docs ("63.3. Extensibility") say of the "distance" support
function: "For an internal tree node, the distance returned must not
be greater than the distance to any of the child nodes". Is there an
opportunity to test this relationship, too, by making sure that
distance is sane? Perhaps this is a very naive question -- I am not a
GiST expert.

If the parent tuple is correctly adjusted by the child tuples, the distance between
them must be 0. With this check we would test the opclass, not the index structure.

* Is it right that gist_check_internal_page() should both return a
value that says "this internal page has child pages that are
themselves internal", while testing the child pages?

Yes.

* Do we need to worry about F_DELETED pages? Why or why not?

Currently, GiST scan does not check for F_DELETED: there simply is no code
to delete a page in GiST (except my patch on commitfest).
I suspect that there may be deleted pages from previous versions.
But encountering one in a search should not be a problem. Thus, I copy the
behavior of index scans: do not complain about deleted pages.

* Do we need to worry about F_HAS_GARBAGE pages? Why or why not?

This flag is for microvacuum and only hints that there may be some vacuumable
tuples on the page. It is consulted when a page split seems necessary, to try to free some space first.

Thanks!

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v5.patch (application/octet-stream)
From 161f39b3001fec522ca1d3bb881635d05f91a984 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Tue, 1 Jan 2019 15:03:13 +0500
Subject: [PATCH] GiST verification function for amcheck v5

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  31 ++
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 391 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  72 +++--
 doc/src/sgml/amcheck.sgml               |  21 ++
 9 files changed, 514 insertions(+), 36 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.1--1.2.sql
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index c5764b544f..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
new file mode 100644
index 0000000000..5571642f4a
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -0,0 +1,14 @@
+/* amcheck--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.2'" to load this file. \quit
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index 469048403d..c6e310046d 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.1'
+default_version = '1.2'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..84da7d7ec0
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f373aea62
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..95e62ba975
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..38c00f103c
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,391 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in the GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from child pages. Also, verification checks graph invariants: an
+ * internal page must have at least one downlink, and an internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "amcheck.h"
+
+#include "access/gist_private.h"
+
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid);
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer);
+
+static inline bool
+gist_check_internal_page(Relation rel, Page page_copy, Buffer buffer,
+						 BufferAccessStrategy strategy, GISTSTATE *state);
+
+static inline void
+gist_check_parent_keys_consistency(Relation rel);
+
+static inline void
+gist_check_page_keys(Relation rel, Buffer parentbuffer, Page page,
+					 BlockNumber blkno, IndexTuple parent, GISTSTATE *state);
+
+static void
+pushStackIfSplited(Page page, GistScanItem *stack);
+
+static inline void
+gist_index_checkable(Relation rel);
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid)
+{
+	/*
+	 * Check that it's not a leftover invalid tuple from pre-9.1.
+	 * See also gistdoinsert() and gistbulkdelete() handling of such tuples.
+	 * We do not treat it as corruption, but report it so the index is rebuilt.
+	 */
+	if (GistTupleIsInvalid(idxtuple))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("index \"%s\" contains an inner tuple marked as"
+						" invalid", RelationGetRelationName(rel)),
+				 errdetail("This is caused by an incomplete page split at "
+				 "crash recovery before upgrading to PostgreSQL 9.1."),
+				 errhint("Please REINDEX it.")));
+
+	if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has inconsistent tuple sizes",
+						RelationGetRelationName(rel))));
+}
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer)
+{
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted pages",
+						RelationGetRelationName(rel))));
+}
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Buffer parentbuffer, Page page,
+					 BlockNumber blkno, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent
+		 * tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+		{
+			/*
+			 * OK, we found a discrepancy between parent and child tuples.
+			 * We need to verify it is not a result of a concurrent call
+			 * of gistplacetopage(). So, lock the parent and try to find the
+			 * downlink for the current page. It may be missing due to a
+			 * concurrent page split; this is OK.
+			 */
+			LockBuffer(parentbuffer, GIST_SHARE);
+			Page parentpage = (Page) BufferGetPage(parentbuffer);
+			OffsetNumber o,
+				parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+
+			for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+			{
+				ItemId p_iid = PageGetItemId(parentpage, o);
+				parent = (IndexTuple) PageGetItem(parentpage, p_iid);
+				BlockNumber child_blkno = ItemPointerGetBlockNumber(&(parent->t_tid));
+				if (child_blkno == blkno)
+				{
+					/* We found it - make a final check before failing */
+					if (gistgetadjusted(rel, parent, idxtuple, state))
+					{
+						ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records",
+									RelationGetRelationName(rel))));
+					}
+					else
+					{
+						/*
+						 * But now it is properly adjusted - nothing to do here.
+						 */
+						break;
+					}
+				}
+			}
+
+
+			LockBuffer(parentbuffer, GIST_UNLOCK);
+		}
+	}
+}
+
+/*
+ * Check of an internal page.
+ * Return true if further descent is necessary.
+ * Hold pins on two pages at a time (parent+child).
+ * But coupled lock on parent is taken only if a parent-child discrepancy is
+ * found. A lock is taken on every child page first, and only then, if
+ * necessary, on the parent inside the gist_check_page_keys() call.
+ */
+static inline bool
+gist_check_internal_page(Relation rel, Page page_copy, Buffer buffer,
+						 BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool has_leafs = false;
+	bool has_internals = false;
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page_copy);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page_copy, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page_copy, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+		Buffer		buffer;
+		Page child_page;
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		child_page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, child_page, buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, buffer, child_page, child_blkno, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static inline void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+	MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, page, buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			if (stack->blkno != GIST_ROOT_BLKNO)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal "
+						"encountered leaf page unexpectedly",
+								RelationGetRelationName(rel))));
+			}
+			check_index_page(rel, page, buffer);
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				iid = PageGetItemId(page, i);
+				idxtuple = (IndexTuple) PageGetItem(page, iid);
+				check_index_tuple(idxtuple, rel, iid);
+			}
+			LockBuffer(buffer, GIST_UNLOCK);
+		}
+		else
+		{
+			/* we need to copy only internal pages */
+			Page page_copy = palloc(BLCKSZ);
+			memcpy(page_copy, page, BLCKSZ);
+			LockBuffer(buffer, GIST_UNLOCK);
+
+			/* check for a split that happened after we looked at the parent */
+			pushStackIfSplited(page_copy, stack);
+
+			if (gist_check_internal_page(rel, page_copy, buffer, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page_copy, i);
+					idxtuple = (IndexTuple) PageGetItem(page_copy, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+
+			pfree(page_copy);
+		}
+
+		ReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, true, &indrel, &heaprel, &lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 964200a767..f17552c5e5 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,22 +21,12 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
 #include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/nbtree.h"
-#include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -195,21 +185,18 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
 
 	if (parentcheck)
-		lockmode = ShareLock;
+		*lockmode = ShareLock;
 	else
-		lockmode = AccessShareLock;
+		*lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -221,9 +208,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = heap_open(heapid, *lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -237,25 +224,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, *lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	bt_check_every_level(indrel, heaprel, parentcheck, heapallindexed);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Pair for amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -266,6 +251,29 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, parentcheck, &indrel, &heaprel, &lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	bt_check_every_level(indrel, heaprel, parentcheck, heapallindexed);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 8bb60d5c2d..c147be1c11 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -162,6 +162,27 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references either only leaf pages or only
+      internal pages). As with <function>bt_index_parent_check</function>,
+      <function>gist_index_parent_check</function> acquires
+      <literal>ShareLock</literal> on the index and heap relations.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.20.1

#12Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Andrey Borodin (#11)
Re: amcheck verification for GiST

There's a little copy-pasto in gist_check_page_keys():

+ for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(i))

Should be "OffsetNumberNext(o)".
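
To illustrate why the increment matters, here is a minimal standalone sketch, not the patch code itself. The OffsetNumber typedef and the two macros are simplified stand-ins for the PostgreSQL definitions, and find_downlink() and buggy_scan_steps() are hypothetical helpers introduced only for this illustration. With the outer variable in the increment, the inner scan either spins on one offset or stops after the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the PostgreSQL definitions (assumption). */
typedef unsigned short OffsetNumber;
#define FirstOffsetNumber	((OffsetNumber) 1)
#define OffsetNumberNext(o)	((OffsetNumber) ((o) + 1))

/*
 * Correct scan: look through downlinks[1..maxoff] (index 0 is unused)
 * for the given block number. Returns its offset, or 0 if it is absent.
 */
static OffsetNumber
find_downlink(const unsigned *downlinks, OffsetNumber maxoff, unsigned blkno)
{
	OffsetNumber o;

	assert(downlinks != NULL);
	for (o = FirstOffsetNumber; o <= maxoff; o = OffsetNumberNext(o))
	{
		if (downlinks[o] == blkno)
			return o;
	}
	return 0;
}

/*
 * Buggy scan: the increment uses the outer loop variable "i" instead of "o".
 * Counts loop iterations; "limit" caps the count so the demo terminates.
 */
static int
buggy_scan_steps(OffsetNumber maxoff, OffsetNumber i, int limit)
{
	OffsetNumber o;
	int			steps = 0;

	for (o = FirstOffsetNumber; o <= maxoff && steps < limit;
		 o = OffsetNumberNext(i))
		steps++;
	return steps;
}
```

In this sketch, buggy_scan_steps(3, 1, 100) runs into the cap because o stays at offset 2, while buggy_scan_steps(3, 3, 100) leaves the loop after looking at offset 1 only.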

I tested this patch with your testing patch from the other thread (after
fixing the above), to leave behind incompletely split pages [1]. It
seems that the amcheck code doesn't expect incomplete splits:

postgres=# SELECT gist_index_parent_check('x_c_idx');
ERROR: index "x_c_idx" has inconsistent records

[1]: /messages/by-id/EB87A69B-EE5E-4259-9EEB-DA9DC1F7E265@yandex-team.ru

- Heikki

#13Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Heikki Linnakangas (#12)
Re: amcheck verification for GiST

On 04/03/2019 17:53, Heikki Linnakangas wrote:

I tested this patch with your testing patch from the other thread (after
fixing the above), to leave behind incompletely split pages [1]. It
seems that the amcheck code doesn't expect incomplete splits:

postgres=# SELECT gist_index_parent_check('x_c_idx');
ERROR: index "x_c_idx" has inconsistent records

On closer look, I think that was because that testing patch to leave
behind incomplete splits really did corrupt the index. It always
inserted the downlink to the parent, but randomly skipped clearing the
FOLLOW_RIGHT flag and updating the NSN in the child. That's not a valid
combination. To test incomplete splits, you need to skip inserting the
downlink to the parent, too.
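
The rule described above can be written down as a tiny predicate. This is only a sketch of my reading of it: split_state_is_valid() and both parameter names are hypothetical, and "child_finalized" stands for "FOLLOW_RIGHT cleared and NSN updated in the child". The message only states that a downlink without the child update is invalid; treating the opposite mismatch as invalid too is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the state a split leaves behind.
 *
 * downlink_inserted: the parent has a downlink for the new right page.
 * child_finalized:   FOLLOW_RIGHT was cleared and the NSN was updated
 *                    in the left child.
 *
 * Both true  -> completed split.
 * Both false -> incomplete split, which a checker has to tolerate.
 * Mismatch   -> not a state a correct insertion leaves behind
 *               (the "child finalized, no downlink" half is an assumption).
 */
static bool
split_state_is_valid(bool downlink_inserted, bool child_finalized)
{
	return downlink_inserted == child_finalized;
}

/* Usage example: the three cases discussed in the message. */
static void
self_check(void)
{
	assert(split_state_is_valid(true, true));	/* completed split */
	assert(split_state_is_valid(false, false));	/* incomplete split */
	assert(!split_state_is_valid(true, false));	/* what the test patch produced */
}
```

Under this model, a testing patch that wants to leave a legitimate incomplete split behind has to skip both steps together.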

- Heikki

#14Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Heikki Linnakangas (#13)
1 attachment(s)
Re: amcheck verification for GiST

Hi!

Here's a new version of GiST amcheck, which takes into account the recently committed GiST VACUUM.
It tests that deleted pages do not contain any data.
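
As a rough sketch of that check (simplified, not the code from the attached patch): SketchPage, PageVerdict and check_deleted_page() are hypothetical names, and the struct is only a stand-in for the real page and its opaque flags. The two conditions are that a deleted page must be a leaf and must hold no tuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a GiST page; not the real page layout. */
typedef struct SketchPage
{
	bool		is_deleted;
	bool		is_leaf;
	int			ntuples;
} SketchPage;

typedef enum PageVerdict
{
	PAGE_OK,
	DELETED_INTERNAL_PAGE,
	DELETED_PAGE_HAS_TUPLES
} PageVerdict;

/* A deleted page is acceptable only if it is an empty leaf page. */
static PageVerdict
check_deleted_page(const SketchPage *page)
{
	assert(page != NULL);

	if (!page->is_deleted)
		return PAGE_OK;			/* nothing to verify for live pages here */
	if (!page->is_leaf)
		return DELETED_INTERNAL_PAGE;
	if (page->ntuples > 0)
		return DELETED_PAGE_HAS_TUPLES;
	return PAGE_OK;
}
```

In the patch the corresponding conditions are reported with ereport(ERROR, ...) instead of being returned as a value.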

Also, Heikki's fix is applied (the wrong OffsetNumberNext(i) is replaced by OffsetNumberNext(o)).

Thanks!

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v6.patchapplication/octet-stream; name=0001-GiST-verification-function-for-amcheck-v6.patch; x-unix-mode=0644Download
From ae14b3d994ebf73687c22136cabaf8df04686c05 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Fri, 22 Mar 2019 22:11:16 +0800
Subject: [PATCH] GiST verification function for amcheck v6

---
 contrib/amcheck/Makefile                |   4 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  10 +
 contrib/amcheck/amcheck.h               |  31 ++
 contrib/amcheck/expected/check_gist.out |   9 +
 contrib/amcheck/sql/check_gist.sql      |   4 +
 contrib/amcheck/verify_gist.c           | 403 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  79 +++--
 doc/src/sgml/amcheck.sgml               |  21 ++
 8 files changed, 524 insertions(+), 37 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
 DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
index 883530dec7..1d461fff5b 100644
--- a/contrib/amcheck/amcheck--1.1--1.2.sql
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -17,3 +17,13 @@ LANGUAGE C STRICT PARALLEL RESTRICTED;
 
 -- Don't want this to be available to public
 REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean) FROM PUBLIC;
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..84da7d7ec0
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, bool parentcheck,Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..4f373aea62
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..95e62ba975
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,4 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+CREATE TABLE gist_check AS SELECT point(s,1) c FROM generate_series(1,10000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..46ffa15977
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,403 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "amcheck.h"
+
+#include "access/gist_private.h"
+
+
+typedef struct GistScanItem
+{
+	GistNSN		parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid);
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer);
+
+static inline bool
+gist_check_internal_page(Relation rel, Page page_copy, Buffer buffer,
+						 BufferAccessStrategy strategy, GISTSTATE *state);
+
+static inline void
+gist_check_parent_keys_consistency(Relation rel);
+
+static inline void
+gist_check_page_keys(Relation rel, Buffer parentbuffer, Page page,
+					 BlockNumber blkno, IndexTuple parent, GISTSTATE *state);
+
+static void
+pushStackIfSplited(Page page, GistScanItem *stack);
+
+static inline void
+gist_index_checkable(Relation rel);
+
+static inline void
+check_index_tuple(IndexTuple idxtuple, Relation rel, ItemId iid)
+{
+	/*
+	 * Check that it's not a leftover invalid tuple from pre-9.1
+	 * See also gistdoinsert() and gistbulkdelete() handling of such tuples.
+	 * We do consider it error here.
+	 */
+	if (GistTupleIsInvalid(idxtuple))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				errmsg("index \"%s\" contains an inner tuple marked as"
+						" invalid",RelationGetRelationName(rel)),
+				 errdetail("This is caused by an incomplete page split at "
+				 "crash recovery before upgrading to PostgreSQL 9.1."),
+				 errhint("Please REINDEX it.")));
+
+	if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has inconsistent tuple sizes",
+						RelationGetRelationName(rel))));
+}
+
+static inline void
+check_index_page(Relation rel, Page page, Buffer buffer)
+{
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted pages",
+						RelationGetRelationName(rel))));
+	if (GistPageIsDeleted(page))
+	{
+		elog(ERROR,"boom");
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted internal page",
+							RelationGetRelationName(rel))));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted page with tuples",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/*
+ * For every tuple on page check if it is contained by tuple on parent page
+ */
+static inline void
+gist_check_page_keys(Relation rel, Buffer parentbuffer, Page page,
+					 BlockNumber blkno, IndexTuple parent, GISTSTATE *state)
+{
+	OffsetNumber i,
+				maxoff = PageGetMaxOffsetNumber(page);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId iid = PageGetItemId(page, i);
+		IndexTuple idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		/*
+		 * Tree is inconsistent if adjustment is necessary for any parent
+		 * tuple
+		 */
+		if (gistgetadjusted(rel, parent, idxtuple, state))
+		{
+			/*
+			 * OK, we found a discrepancy between parent and child tuples.
+			 * We need to verify it is not a result of concurrent call
+			 * of gistplacetopage(). So, lock parent and try to find downlink
+			 * for current page. It may be missing due to concurrent page
+			 * split, this is OK.
+			 */
+			LockBuffer(parentbuffer, GIST_SHARE);
+			Page parentpage = (Page) BufferGetPage(parentbuffer);
+			OffsetNumber o,
+				parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+
+			for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+			{
+				ItemId p_iid = PageGetItemId(parentpage, o);
+				parent = (IndexTuple) PageGetItem(parentpage, p_iid);
+				BlockNumber child_blkno = ItemPointerGetBlockNumber(&(parent->t_tid));
+				if (child_blkno == blkno)
+				{
+					/* We found it - make a final check before failing */
+					if (gistgetadjusted(rel, parent, idxtuple, state))
+					{
+						ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							errmsg("index \"%s\" has inconsistent records",
+									RelationGetRelationName(rel))));
+					}
+					else
+					{
+						/*
+						 * But now it is properly adjusted - nothing to do here.
+						 */
+						break;
+					}
+				}
+			}
+
+			LockBuffer(parentbuffer, GIST_UNLOCK);
+		}
+	}
+}
+
+/*
+ * Check of an internal page.
+ * Return true if further descent is necessary.
+ * Hold pins on two pages at a time (parent+child).
+ * But coupled lock on parent is taken only if a parent-child discrepancy is
+ * found. A lock is taken on every child page first, and only then, if
+ * necessary, on the parent inside the gist_check_page_keys() call.
+ */
+static inline bool
+gist_check_internal_page(Relation rel, Page page_copy, Buffer buffer,
+						 BufferAccessStrategy strategy, GISTSTATE *state)
+{
+	bool		 has_leafs = false;
+	bool		 has_internals = false;
+	OffsetNumber i,
+				 maxoff = PageGetMaxOffsetNumber(page_copy);
+
+	for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+	{
+		ItemId		iid = PageGetItemId(page_copy, i);
+		IndexTuple	idxtuple = (IndexTuple) PageGetItem(page_copy, iid);
+
+		BlockNumber child_blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+		Buffer		buffer;
+		Page		child_page;
+
+		check_index_tuple(idxtuple, rel, iid);
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, child_blkno,
+									RBM_NORMAL, strategy);
+
+		LockBuffer(buffer, GIST_SHARE);
+		child_page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, child_page, buffer);
+
+		has_leafs = has_leafs || GistPageIsLeaf(child_page);
+		has_internals = has_internals || !GistPageIsLeaf(child_page);
+		gist_check_page_keys(rel, buffer, child_page, child_blkno, idxtuple, state);
+
+		UnlockReleaseBuffer(buffer);
+	}
+
+	if (!(has_leafs || has_internals))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" internal page has no downlink references",
+						RelationGetRelationName(rel))));
+
+	if (has_leafs == has_internals)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" page references both internal and leaf pages",
+						RelationGetRelationName(rel))));
+
+	return has_internals;
+}
+
+/* add pages with unfinished split to scan */
+static void
+pushStackIfSplited(Page page, GistScanItem *stack)
+{
+	GISTPageOpaque opaque = GistPageGetOpaque(page);
+
+	if (stack->blkno != GIST_ROOT_BLKNO && !XLogRecPtrIsInvalid(stack->parentlsn) &&
+		(GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page)) &&
+		opaque->rightlink != InvalidBlockNumber /* sanity check */ )
+	{
+		/* split page detected, install right link to the stack */
+
+		GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+		ptr->blkno = opaque->rightlink;
+		ptr->parentlsn = stack->parentlsn;
+		ptr->next = stack->next;
+		stack->next = ptr;
+	}
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static inline void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	GistScanItem *stack,
+			   *ptr;
+
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+
+	MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+												 "amcheck context",
+												 ALLOCSET_DEFAULT_SIZES);
+
+	MemoryContext oldcontext = MemoryContextSwitchTo(mctx);
+	GISTSTATE *state = initGISTstate(rel);
+
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		IndexTuple	idxtuple;
+		ItemId		iid;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		check_index_page(rel, page, buffer);
+		maxoff = PageGetMaxOffsetNumber(page);
+
+		if (GistPageIsLeaf(page))
+		{
+			/* should never happen unless it is root */
+			if (stack->blkno != GIST_ROOT_BLKNO)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						errmsg("index \"%s\": internal pages traversal "
+						"encountered leaf page unexpectedly",
+								RelationGetRelationName(rel))));
+			}
+			check_index_page(rel, page, buffer);
+
+			for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+			{
+				iid = PageGetItemId(page, i);
+				idxtuple = (IndexTuple) PageGetItem(page, iid);
+				check_index_tuple(idxtuple, rel, iid);
+			}
+			LockBuffer(buffer, GIST_UNLOCK);
+		}
+		else
+		{
+			/* we need to copy only internal pages */
+			Page page_copy = palloc(BLCKSZ);
+			memcpy(page_copy, page, BLCKSZ);
+			LockBuffer(buffer, GIST_UNLOCK);
+
+			/* check for a split that happened after we looked at the parent */
+			pushStackIfSplited(page_copy, stack);
+
+			if (gist_check_internal_page(rel, page_copy, buffer, strategy, state))
+			{
+				for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+				{
+					iid = PageGetItemId(page_copy, i);
+					idxtuple = (IndexTuple) PageGetItem(page_copy, iid);
+
+					ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+					ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+					ptr->parentlsn = BufferGetLSNAtomic(buffer);
+					ptr->next = stack->next;
+					stack->next = ptr;
+				}
+			}
+
+			pfree(page_copy);
+		}
+
+		ReleaseBuffer(buffer);
+
+		ptr = stack->next;
+		pfree(stack);
+		stack = ptr;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static inline void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, true, &indrel, &heaprel, &lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 6ae3bca953..50991a656a 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,23 +21,14 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
 #include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -212,23 +203,18 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	bool		heapkeyspace;
-	LOCKMODE	lockmode;
 
 	if (parentcheck)
-		lockmode = ShareLock;
+		*lockmode = ShareLock;
 	else
-		lockmode = AccessShareLock;
+		*lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -240,9 +226,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = heap_open(heapid, *lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -256,27 +242,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, *lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	heapkeyspace = _bt_heapkeyspace(indrel);
-	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-						 heapallindexed, rootdescend);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Pair for  amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -287,6 +269,33 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
+						bool rootdescend)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+	bool 		heapkeyspace;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, parentcheck, &indrel, &heaprel, &lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	heapkeyspace = _bt_heapkeyspace(indrel);
+	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
+						 heapallindexed, rootdescend);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4..34842eaebf 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,27 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references only leaf pages or only
+      internal pages). As with <function>bt_index_parent_check</function>,
+      <function>gist_index_parent_check</function> acquires a
+      <literal>ShareLock</literal> on the index and heap relations.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.20.1

#15Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Andrey Borodin (#14)
1 attachment(s)
Re: amcheck verification for GiST

On 27/03/2019 11:51, Andrey Borodin wrote:

Hi!

Here's new version of GiST amcheck, which takes into account recently committed GiST VACUUM.
It tests that deleted pages do not contain any data.

Thanks! I had a look, and refactored it quite a bit.

I found the way the recursion worked confusing. On each internal page,
it looped through all the child nodes, checking the consistency of the
downlinks. And then it looped through the children again, to recurse.
This isn't performance-critical, but visiting every page twice still
seems strange.

In gist_check_page_keys(), if we get into the code to deal with a
concurrent update, we set 'parent' to point to a tuple on a parent page,
then unlock it, and continue to look at remaining tuples, using the
pointer that points to an unlocked buffer.

I came up with the attached, which fixes the above-mentioned things. I
also replaced the check that each node has only internal or leaf
children, with a different check that the tree has the same height in
all branches. That catches more potential problems, and was easier to
implement after the refactoring. This still needs at least a round of
fixing typos and tidying up comments, but it's more straightforward now,
IMHO.
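
The "same height in all branches" check described above can be sketched in isolation. This is only an illustration on a toy in-memory tree, not the patch's code: `ToyNode`, `TOY_MAX_STACK` and `toy_tree_has_uniform_height()` are hypothetical names, and a real GiST page would be read through the buffer manager instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory node; a real GiST page lives in a shared buffer. */
typedef struct ToyNode
{
    bool        is_leaf;
    int         nchildren;
    struct ToyNode *children[4];
} ToyNode;

#define TOY_MAX_STACK 64

/*
 * Returns true if every leaf reachable from 'root' sits at the same depth.
 * An explicit stack is used, so each node is visited exactly once.
 */
static bool
toy_tree_has_uniform_height(ToyNode *root)
{
    ToyNode    *nodes[TOY_MAX_STACK];
    int         depths[TOY_MAX_STACK];
    int         top = 0;
    int         leafdepth = -1; /* unknown until the first leaf is seen */

    nodes[top] = root;
    depths[top] = 0;
    top++;

    while (top > 0)
    {
        ToyNode    *node;
        int         depth;

        top--;
        node = nodes[top];
        depth = depths[top];

        if (node->is_leaf)
        {
            if (leafdepth == -1)
                leafdepth = depth;  /* first leaf fixes the expected depth */
            else if (depth != leafdepth)
                return false;       /* leaf found at a different depth */
            continue;
        }

        /* Internal node: schedule every child one level deeper. */
        for (int i = 0; i < node->nchildren; i++)
        {
            assert(top < TOY_MAX_STACK);
            nodes[top] = node->children[i];
            depths[top] = depth + 1;
            top++;
        }
    }
    return true;
}
```

Remembering the depth of the first leaf and comparing every later leaf against it also catches an internal page that mixes leaf and internal children, because the two kinds of children cannot both lead to leaves at the same depth.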

What have you been using to test this?

- Heikki

Attachments:

amcheck-gist-v6-heikki.patchtext/x-patch; name=amcheck-gist-v6-heikki.patchDownload
diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b85203..dd9b5ecf926 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
 DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
index 883530dec74..1d461fff5b9 100644
--- a/contrib/amcheck/amcheck--1.1--1.2.sql
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -17,3 +17,13 @@ LANGUAGE C STRICT PARALLEL RESTRICTED;
 
 -- Don't want this to be available to public
 REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean) FROM PUBLIC;
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 00000000000..84da7d7ec0f
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, bool parentcheck,Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 00000000000..8b37b20bc9c
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,348 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+
+
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static void check_index_page(Relation rel, Buffer buffer);
+
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno, BufferAccessStrategy strategy);
+
+static void gist_check_parent_keys_consistency(Relation rel);
+
+static void gist_index_checkable(Relation rel);
+
+static void
+check_index_page(Relation rel, Buffer buffer)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted pages",
+						RelationGetRelationName(rel))));
+	if (GistPageIsDeleted(page))
+	{
+		elog(ERROR,"boom");
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted internal page",
+							RelationGetRelationName(rel))));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted page with tuples",
+							RelationGetRelationName(rel))));
+	}
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, childblkno,
+								   RBM_NORMAL, strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemId(parentpage, o);
+		IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	LockBuffer(parentbuf, GIST_UNLOCK);
+
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter
+	 * a leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer);
+
+		/*
+		 * It's possible that the page was split since we looked at the parent,
+		 * so that we missed the downlink of the right sibling when we
+		 * scanned the parent. If so, add the right sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly",
+								RelationGetRelationName(rel))));
+			}
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemId(page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1
+			 * See also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid",
+								RelationGetRelationName(rel)),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes",
+								RelationGetRelationName(rel))));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 *
+			 * XXX: shouldn't we rather use gist_consistent?
+			 */
+			if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a  discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call
+				 * of gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk, stack->blkno, strategy);
+
+				/* We found it - make a final check before failing */
+				if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records",
+									RelationGetRelationName(rel))));
+				}
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, true, &indrel, &heaprel, &lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 6ae3bca9536..50991a656a5 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,23 +21,14 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
 #include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -212,23 +203,18 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	bool		heapkeyspace;
-	LOCKMODE	lockmode;
 
 	if (parentcheck)
-		lockmode = ShareLock;
+		*lockmode = ShareLock;
 	else
-		lockmode = AccessShareLock;
+		*lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -240,9 +226,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = heap_open(heapid, *lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -256,27 +242,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, *lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	heapkeyspace = _bt_heapkeyspace(indrel);
-	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-						 heapallindexed, rootdescend);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Pair for  amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -287,6 +269,33 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
+						bool rootdescend)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+	bool 		heapkeyspace;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, parentcheck, &indrel, &heaprel, &lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	heapkeyspace = _bt_heapkeyspace(indrel);
+	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
+						 heapallindexed, rootdescend);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4a..34842eaebf6 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,27 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references only leaf pages or only
+      internal pages). As with <function>bt_index_parent_check</function>,
+      <function>gist_index_parent_check</function> acquires a
+      <literal>ShareLock</literal> on the index and heap relations.
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
In reply to: Heikki Linnakangas (#15)
Re: amcheck verification for GiST

On Wed, Mar 27, 2019 at 10:29 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Thanks! I had a look, and refactored it quite a bit.

I'm really happy that other people seem to be driving amcheck in a new
direction, without any prompting from me. It's too important to remain
something that only I have contributed to.

I found the way the recursion worked confusing. On each internal page,
it looped through all the child nodes, checking the consistency of the
downlinks. And then it looped through the children again, to recurse.
This isn't performance-critical, but visiting every page twice still
seems strange.

To be fair, that's actually what bt_index_parent_check() does. OTOH,
it has to work in a way that makes it an extension of
bt_index_check(), which is not a problem for
gist_index_parent_check().

In gist_check_page_keys(), if we get into the code to deal with a
concurrent update, we set 'parent' to point to a tuple on a parent page,
then unlock it, and continue to look at remaining tuples, using the
pointer that points to an unlocked buffer.

Why not just copy the page into a local buffer? See my remarks on this below.

I came up with the attached, which fixes the above-mentioned things. I
also replaced the check that each node has only internal or leaf
children, with a different check that the tree has the same height in
all branches. That catches more potential problems, and was easier to
implement after the refactoring. This still needs at least a round of
fixing typos and tidying up comments, but it's more straightforward now,
IMHO.

You probably didn't mean to leave this "boom" error behind:

+   if (GistPageIsDeleted(page))
+   {
+       elog(ERROR,"boom");

I see that you have this check for deleted pages:

+       if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+           ereport(ERROR,
+                   (errcode(ERRCODE_INDEX_CORRUPTED),
+                   errmsg("index \"%s\" has deleted page with tuples",
+                           RelationGetRelationName(rel))));
+   }

Why not have a similar check for non-deleted pages, whose maxoffset
must be <= MaxIndexTuplesPerPage?
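
A bound check of this kind might look roughly like the sketch below. The constants are assumptions chosen for illustration (they are not taken from the PostgreSQL headers), and `toy_maxoff_is_sane()` is a hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* All sizes below are assumed values for the sake of the example. */
#define TOY_PAGE_SIZE       8192    /* block size */
#define TOY_PAGE_HEADER     24      /* page header size */
#define TOY_MIN_TUPLE_SIZE  16      /* smallest aligned index tuple */
#define TOY_LINE_POINTER    4       /* one line pointer per tuple */

/* Upper bound on the number of tuples that can physically fit on a page. */
#define TOY_MAX_TUPLES_PER_PAGE \
    ((TOY_PAGE_SIZE - TOY_PAGE_HEADER) / (TOY_MIN_TUPLE_SIZE + TOY_LINE_POINTER))

/*
 * A deleted page must be empty; a live page may hold at most as many
 * tuples as could fit if every tuple had the minimum size.
 */
static bool
toy_maxoff_is_sane(bool page_is_deleted, size_t maxoff)
{
    if (page_is_deleted)
        return maxoff == 0;
    return maxoff <= TOY_MAX_TUPLES_PER_PAGE;
}
```

The point of the upper bound is that a corrupted page header can report an arbitrarily large offset count, and a verifier that trusts it would walk off the end of the page.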

I see various errors like this one:

+           if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+               ereport(ERROR,
+                       (errcode(ERRCODE_INDEX_CORRUPTED),
+                        errmsg("index \"%s\" has inconsistent tuple sizes",
+                               RelationGetRelationName(rel))));

Can't we say which TID is involved here, so we can find the offending
corrupt tuple afterwards? Or at least the block number? And maybe even
the LSN of the page? I think that that kind of stuff could be added in
a number of places.
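
As a rough sketch of what a more specific report could carry, the helper below formats the block number, the offset and the page LSN into the message. `toy_format_corruption()` is a hypothetical stand-alone function using `snprintf`; in the patch itself this information would go through `ereport()` and `errmsg()` instead.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a corruption message that pinpoints the tuple: block number,
 * offset within the block, and the page LSN split into its high and low
 * 32-bit halves, printed as hex.
 */
static int
toy_format_corruption(char *buf, size_t buflen, const char *indexname,
                      const char *problem, uint32_t blkno, uint16_t offset,
                      uint64_t lsn)
{
    return snprintf(buf, buflen,
                    "index \"%s\" %s, block %u, offset %u, page lsn %X/%X",
                    indexname, problem,
                    (unsigned int) blkno, (unsigned int) offset,
                    (unsigned int) (lsn >> 32), (unsigned int) lsn);
}
```

With the block and offset in the message, the offending tuple can be located afterwards with a page inspection tool without rerunning the check.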

I see this stuff that's related to concurrent processes:

+       /*
+        * It's possible that the page was split since we looked at the parent,
+        * so that we missed the downlink of the right sibling when we
+        * scanned the parent. If so, add the right sibling to the stack now.
+        */
+               /*
+                * There was a  discrepancy between parent and child tuples.
+                * We need to verify it is not a result of concurrent call
+                * of gistplacetopage(). So, lock parent and try to find downlink
+                * for current page. It may be missing due to concurrent page
+                * split, this is OK.
+                */

Is this really needed? Isn't the ShareLock on the index sufficient? If so, why?

+ stack->parenttup = gist_refind_parent(rel, stack->parentblk, stack->blkno, strategy);

If the gistplacetopage() stuff is truly necessary, then is it okay to
call gist_refind_parent() with the original buffer lock still held
like this?

I still suspect that we should have something like palloc_btree_page()
for GiST, so that we're always operating on a copy of the page in
palloc()'d memory. Maybe it's worthwhile to do something clever with
concurrently holding buffer locks, but if that's what we're doing here
then I would also expect us to have something weaker than ShareLock as
our relation-level heavyweight lock. And, there should be a prominent
explanation of the theory behind it somewhere.
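
The copy-the-page idea can be illustrated with a toy buffer. This is a sketch under stated assumptions: `ToyBuffer` and `toy_copy_page()` are hypothetical, `malloc` stands in for `palloc`, and a boolean flag stands in for the buffer lock taken with `LockBuffer()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical stand-in for a shared buffer with a content lock. */
typedef struct ToyBuffer
{
    bool        locked;
    char        data[TOY_PAGE_SIZE];
} ToyBuffer;

/*
 * Return a private snapshot of the page.  The lock is held only while the
 * bytes are copied, so the caller never keeps pointers into the shared
 * buffer.  The caller owns the result and must free() it; NULL is
 * returned if memory cannot be allocated.
 */
static char *
toy_copy_page(ToyBuffer *buf)
{
    char       *copy = malloc(TOY_PAGE_SIZE);

    if (copy == NULL)
        return NULL;

    assert(!buf->locked);
    buf->locked = true;         /* stands in for LockBuffer(..., GIST_SHARE) */
    memcpy(copy, buf->data, TOY_PAGE_SIZE);
    buf->locked = false;        /* stands in for LockBuffer(..., GIST_UNLOCK) */

    return copy;
}
```

Working on a snapshot trades some memory and copying for simpler reasoning: no tuple pointer can dangle after the buffer is unlocked, and no two buffer locks are ever held at once.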

What have you been using to test this?

pg_hexedit has full support for GiST. ;-)

--
Peter Geoghegan

#17Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Heikki Linnakangas (#15)
1 attachment(s)
Re: amcheck verification for GiST

Thanks for looking into this!

On 27 March 2019, at 22:29, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 27/03/2019 11:51, Andrey Borodin wrote:

Hi!
Here's new version of GiST amcheck, which takes into account recently committed GiST VACUUM.
It tests that deleted pages do not contain any data.

Thanks! I had a look, and refactored it quite a bit.

Cool! New scan logic is much easier to read.

I found the way the recursion worked confusing. On each internal page, it looped through all the child nodes, checking the consistency of the downlinks. And then it looped through the children again, to recurse. This isn't performance-critical, but visiting every page twice still seems strange.

In gist_check_page_keys(), if we get into the code to deal with a concurrent update, we set 'parent' to point to a tuple on a parent page, then unlock it, and continue to look at remaining tuples, using the pointer that points to an unlocked buffer.

Uh, that was a tricky bug.

I came up with the attached, which fixes the above-mentioned things. I also replaced the check that each node has only internal or leaf children, with a different check that the tree has the same height in all branches. That catches more potential problems, and was easier to implement after the refactoring. This still needs at least a round of fixing typos and tidying up comments, but it's more straightforward now, IMHO.

What have you been using to test this?

Please see the attached patch with the line
//if (false) // THIS LINE IS INTENTIONALLY BROKEN
which breaks GiST consistency. Uncomment it to create a logically broken GiST.

Also, I've fixed the buffer release and a small typo (childblkno->parentblkno).

To test that it does not deadlock with inserts and vacuums, I use pgbench the same way as it is used here [0] (/messages/by-id/96ec7ebd-42b9-4df5-18a4-42181c8a5a41@iki.fi), but also with
SELECT gist_index_parent_check('gist_check_idx');

I've also added a NOTICE for the case when the parent is not re-found. To test this, I temporarily removed the adjustment check here:
if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))

XXX: shouldn't we rather use gist_consistent?

No, a consistency test must use a scan strategy, which can be absent from the opclass.
Parent tuples must be adjusted to cover every child. We check this simple invariant; it should cover all corner cases.
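To make the invariant concrete, here is a toy version of the "parent needs no adjustment" test using bounding boxes. This is a sketch under assumptions, not the patch's code: `ToyBox` and the `toy_*` functions are hypothetical stand-ins. In the patch the test is a call to gistgetadjusted(), which signals that the parent key would have to change to cover the child. The toy version only needs a union and an equality test, so no scan strategy is involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical key type: an axis-aligned bounding box. */
typedef struct ToyBox
{
	double		xlo, ylo, xhi, yhi;
} ToyBox;

/* Smallest box covering both a and b. */
static ToyBox
toy_box_union(ToyBox a, ToyBox b)
{
	ToyBox		u;

	assert(a.xlo <= a.xhi && a.ylo <= a.yhi);
	assert(b.xlo <= b.xhi && b.ylo <= b.yhi);

	u.xlo = a.xlo < b.xlo ? a.xlo : b.xlo;
	u.ylo = a.ylo < b.ylo ? a.ylo : b.ylo;
	u.xhi = a.xhi > b.xhi ? a.xhi : b.xhi;
	u.yhi = a.yhi > b.yhi ? a.yhi : b.yhi;
	return u;
}

static bool
toy_box_equal(ToyBox a, ToyBox b)
{
	return a.xlo == b.xlo && a.ylo == b.ylo &&
		a.xhi == b.xhi && a.yhi == b.yhi;
}

/*
 * True if the parent key would have to grow to cover the child key.
 * A healthy parent/child pair must return false here.
 */
static bool
toy_needs_adjustment(ToyBox parent, ToyBox child)
{
	return !toy_box_equal(toy_box_union(parent, child), parent);
}
```

A child box that sticks out of the parent on any side makes the union differ from the parent, which is the situation the verification reports (after re-checking against a fresh copy of the downlink).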

On 28 March 2019, at 4:57, Peter Geoghegan <pg@bowt.ie> wrote:

In gist_check_page_keys(), if we get into the code to deal with a
concurrent update, we set 'parent' to point to a tuple on a parent page,
then unlock it, and continue to look at remaining tuples, using the
pointer that points to an unlocked buffer.

Why not just copy the page into a local buffer? See my remarks on this below.

It was already copied; the bug was in reusing a copy that came from a released buffer. This happened in the case where we suspected an error that turned out not to be an error.

You probably didn't mean to leave this "boom" error behind:

+   if (GistPageIsDeleted(page))
+   {
+       elog(ERROR,"boom");

Oops. Sorry.

I see that you have this check for deleted pages:

+       if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+           ereport(ERROR,
+                   (errcode(ERRCODE_INDEX_CORRUPTED),
+                   errmsg("index \"%s\" has deleted page with tuples",
+                           RelationGetRelationName(rel))));
+   }

Why not have a similar check for non-deleted pages, whose maxoffset
must be <= MaxIndexTuplesPerPage?

Done.

I see various errors like this one:

+           if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+               ereport(ERROR,
+                       (errcode(ERRCODE_INDEX_CORRUPTED),
+                        errmsg("index \"%s\" has inconsistent tuple sizes",
+                               RelationGetRelationName(rel))));

Can't we say which TID is involved here, so we can find the offending
corrupt tuple afterwards? Or at least the block number? And maybe even
the LSN of the page? I think that that kind of stuff could be added in
a number of places.

I've added the block number and offset wherever they are known. I do not understand the point of the LSN here...

I see this stuff that's related to concurrent processes:

+       /*
+        * It's possible that the page was split since we looked at the parent,
+        * so that we missed the downlink of the right sibling when we
+        * scanned the parent. If so, add the right sibling to the stack now.
+        */
+               /*
+                * There was a  discrepancy between parent and child tuples.
+                * We need to verify it is not a result of concurrent call
+                * of gistplacetopage(). So, lock parent and try to find downlink
+                * for current page. It may be missing due to concurrent page
+                * split, this is OK.
+                */

Is this really needed? Isn't the ShareLock on the index sufficient? If so, why?

There may be concurrent inserts? In GiST they can reorder items on page.

+ stack->parenttup = gist_refind_parent(rel, stack->parentblk, stack->blkno, strategy);

If the gistplacetopage() stuff is truly necessary, then is it okay to
call gist_refind_parent() with the original buffer lock still held
like this?

When we call gist_refind_parent(), we hold the lock on the child and take a lock on the parent.
We exclude concurrent VACUUM, so the parent cannot become a child of the current child, because it would have to be recycled for such a coincidence to happen.

I still suspect that we should have something like palloc_btree_page()
for GiST, so that we're always operating on a copy of the page in
palloc()'d memory.

That is actually what we are doing now. GistScanItem->parenttup is always a copy. But we make more small copies of individual tuples, because we sometimes have to refresh these copies.
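The copy-before-unlock pattern mentioned here can be sketched in isolation. The patch uses CopyIndexTuple() for this; the version below is a hypothetical stand-in with plain malloc() and a made-up `ToyTuple` layout, shown only to illustrate why a private copy stays valid after the page it came from changes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size tuple; real index tuples are variable length. */
typedef struct ToyTuple
{
	size_t		size;		/* total size of the tuple in bytes */
	unsigned	childblk;	/* block number the downlink points to */
} ToyTuple;

/*
 * Returns a private heap copy of a tuple that lives on a shared page.
 * The caller owns the copy and must free() it. Returns NULL on allocation
 * failure.
 */
static ToyTuple *
toy_copy_tuple(const ToyTuple *src)
{
	ToyTuple   *dst;

	assert(src != NULL && src->size >= sizeof(ToyTuple));

	dst = malloc(src->size);
	if (dst == NULL)
		return NULL;
	memcpy(dst, src, src->size);
	return dst;
}
```

Once the copy is taken, the shared page can be unlocked and rewritten by another backend without affecting the copy. The bug discussed earlier in the thread was the opposite case: keeping a pointer into memory whose buffer had already been released.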

Maybe it's worthwhile to do something clever with
concurrently holding buffer locks, but if that's what we're doing here
then I would also expect us to have something weaker than ShareLock as
our relation-level heavyweight lock. And, there should be a prominent
explanation of the theory behind it somewhere.

We definitely can run under SharedLock. I will try to summarize everything that prevents us from using weaker lock levels in my next message...

What have you been using to test this?

pg_hexedit has full support for GiST. ;-)

For me it is easier to break GiST in its code for tests :)

Thanks!

Best regards, Andrey Borodin.

[0]: /messages/by-id/96ec7ebd-42b9-4df5-18a4-42181c8a5a41@iki.fi

Attachments:

0001-GiST-verification-function-for-amcheck-v7.patch (application/octet-stream)
From dc70466459f4e08e191b9052774afe951a4c6903 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Thu, 28 Mar 2019 14:46:55 +0500
Subject: [PATCH] GiST verification function for amcheck v7

---
 contrib/amcheck/Makefile                |   4 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  10 +
 contrib/amcheck/amcheck.h               |  31 ++
 contrib/amcheck/expected/check_gist.out |  16 +
 contrib/amcheck/sql/check_gist.sql      |   6 +
 contrib/amcheck/verify_gist.c           | 369 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  79 ++---
 src/backend/access/gist/gist.c          |   1 +
 8 files changed, 479 insertions(+), 37 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
 DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
index 883530dec7..1d461fff5b 100644
--- a/contrib/amcheck/amcheck--1.1--1.2.sql
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -17,3 +17,13 @@ LANGUAGE C STRICT PARALLEL RESTRICTED;
 
 -- Don't want this to be available to public
 REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean) FROM PUBLIC;
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..84da7d7ec0
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, bool parentcheck,Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..fe884dbac4
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,16 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..cbfae60883
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,6 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..222b7d6e70
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,369 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+
+
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno, BufferAccessStrategy strategy);
+
+static void gist_check_parent_keys_consistency(Relation rel);
+
+static void gist_index_checkable(Relation rel);
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else
+	{
+		if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has page %d with exceeding count of tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno,
+								   RBM_NORMAL, strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemId(parentpage, o);
+		IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter
+	 * a leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the parent,
+		 * so that we missed the downlink of the right sibling when we
+		 * scanned the parent. If so, add the right sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemId(page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1
+			 * See also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 *
+			 * XXX: shouldn't we rather use gist_consistent?
+			 */
+			if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a  discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call
+				 * of gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk, stack->blkno, strategy);
+
+				/* If the downlink was re-found, make a final check before failing */
+				if (!stack->parenttup)
+				{
+					elog(NOTICE, "Unable to find parent tuple for block %d on "
+							"block %d due to concurrent split",
+							stack->blkno, stack->parentblk);
+				}
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %d offset %d",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, true, &indrel, &heaprel, &lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9ecb1999e3..3bceb7e807 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,23 +21,14 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
-#include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -212,23 +203,18 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, bool parentcheck, Relation *indrel,
+						Relation *heaprel, LOCKMODE	*lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	bool		heapkeyspace;
-	LOCKMODE	lockmode;
 
 	if (parentcheck)
-		lockmode = ShareLock;
+		*lockmode = ShareLock;
 	else
-		lockmode = AccessShareLock;
+		*lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -240,9 +226,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = heap_open(heapid, *lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -256,27 +242,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, *lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	heapkeyspace = _bt_heapkeyspace(indrel);
-	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-						 heapallindexed, rootdescend);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Counterpart of amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -287,6 +269,33 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
+						bool rootdescend)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+	bool 		heapkeyspace;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, parentcheck, &indrel, &heaprel, &lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	heapkeyspace = _bt_heapkeyspace(indrel);
+	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
+						 heapallindexed, rootdescend);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 2fddb23496..4bf25945ed 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -726,6 +726,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 			 * consistent with the key we're inserting. Update it if it's not.
 			 */
 			newtup = gistgetadjusted(state.r, idxtuple, itup, giststate);
+			//if (false) // THIS LINE IS INTENTIONALLY BROKEN
 			if (newtup)
 			{
 				/*
-- 
2.20.1

#18Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Andrey Borodin (#17)
Re: amcheck verification for GiST

On 28 March 2019, at 18:35, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Is this really needed? Isn't the ShareLock on the index sufficient? If so, why?

There may be concurrent inserts? In GiST they can reorder items on page.

It looks like I've tried to cope with the same problem twice:
v3 of the patch used AccessShareLock and took many locks in an incorrect order.
We could use one of two possible solutions: either use ShareLock, or rewrite the scan to use the correct locking order.
But patches v4-v7 use both.
I think we should use AccessShareLock, since we have implemented the tricky logic with gist_refind_parent().

+ stack->parenttup = gist_refind_parent(rel, stack->parentblk, stack->blkno, strategy);

If the gistplacetopage() stuff is truly necessary, then is it okay to
call gist_refind_parent() with the original buffer lock still held
like this?

When we call gist_refind_parent(), we hold the lock on the child and take a lock on the parent.
We exclude concurrent VACUUM, so the parent cannot become a child of the current child, because it would have to be recycled for such a coincidence to happen.

That's merely a hard form of paranoia; internal pages are never deleted. gist_index_parent_check() would work just fine with concurrent VACUUM, INSERTs and SELECTs.

Best regards, Andrey Borodin.

In reply to: Andrey Borodin (#18)
Re: amcheck verification for GiST

On Thu, Mar 28, 2019 at 10:08 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Is this really needed? Isn't the ShareLock on the index sufficient? If so, why?

There may be concurrent inserts? In GiST they can reorder items on page.

It looks like I've tried to cope with the same problem twice:
v3 of the patch used AccessShareLock and took many locks in an incorrect order.
We could use one of two possible solutions: either use ShareLock, or rewrite the scan to use the correct locking order.
But patches v4-v7 use both.

It definitely has to be one or the other. The combination makes no sense.

--
Peter Geoghegan

#20Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Peter Geoghegan (#19)
1 attachment(s)
Re: amcheck verification for GiST

On 29 March 2019, at 5:35, Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Mar 28, 2019 at 10:08 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Is this really needed? Isn't the ShareLock on the index sufficient? If so, why?

There may be concurrent inserts? In GiST they can reorder items on page.

It looks like I've tried to cope with the same problem twice:
v3 of the patch used AccessShareLock and took many locks in an incorrect order.
We could use one of two possible solutions: either use ShareLock, or rewrite the scan to use the correct locking order.
But patches v4-v7 use both.

It definitely has to be one or the other. The combination makes no sense.

Here's an updated patch with AccessShareLock.
I've tried to stress this with a combination of random inserts, vacuums and checks. This process neither failed nor deadlocked.
The patch intentionally contains one superfluous line to make GiST logically broken. This triggers the amcheck regression test.

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v8.patch (application/octet-stream)
From 010c7f1d3348140370fac6ec809c160f336f34d5 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Thu, 28 Mar 2019 14:46:55 +0500
Subject: [PATCH] GiST verification function for amcheck v8

---
 contrib/amcheck/Makefile                |   4 +-
 contrib/amcheck/amcheck--1.1--1.2.sql   |  10 +
 contrib/amcheck/amcheck.h               |  31 ++
 contrib/amcheck/expected/check_gist.out |  16 +
 contrib/amcheck/sql/check_gist.sql      |   6 +
 contrib/amcheck/verify_gist.c           | 377 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  85 +++---
 doc/src/sgml/amcheck.sgml               |  19 ++
 src/backend/access/gist/gist.c          |   1 +
 9 files changed, 509 insertions(+), 40 deletions(-)
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..dd9b5ecf92 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
 DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.1--1.2.sql b/contrib/amcheck/amcheck--1.1--1.2.sql
index 883530dec7..1d461fff5b 100644
--- a/contrib/amcheck/amcheck--1.1--1.2.sql
+++ b/contrib/amcheck/amcheck--1.1--1.2.sql
@@ -17,3 +17,13 @@ LANGUAGE C STRICT PARALLEL RESTRICTED;
 
 -- Don't want this to be available to public
 REVOKE ALL ON FUNCTION bt_index_parent_check(regclass, boolean, boolean) FROM PUBLIC;
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..ac3b6da494
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,31 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+						Relation *heaprel, LOCKMODE	lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..fe884dbac4
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,16 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..cbfae60883
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,6 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..3247014042
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,377 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+
+
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+					BlockNumber childblkno, BufferAccessStrategy strategy);
+
+static void gist_check_parent_keys_consistency(Relation rel);
+
+static void gist_index_checkable(Relation rel);
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else
+	{
+		if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has page %d with exceeding count of tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+}
+
+/*
+ * Try to re-find downlink pointing to 'childblkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno,
+								   RBM_NORMAL, strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemId(parentpage, o);
+		IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter
+	 * a leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the parent,
+		 * so that we missed the downlink of the right sibling when we
+		 * scanned the parent. If so, add the right sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal"
+						 		" encountered leaf page unexpectedly on block %d",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemId(page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.
+			 * See also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as"
+						 		" invalid, block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split"
+						 			" at crash recovery before upgrading to"
+									" PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes,"
+						 		" block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 *
+			 * XXX: shouldn't we rather use gist_consistent?
+			 */
+			if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call
+				 * of gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+														stack->blkno, strategy);
+
+				/* We found it - make a final check before failing */
+				if (!stack->parenttup)
+				{
+					elog(NOTICE, "Unable to find parent tuple for block %d on "
+							"block %d due to concurrent split",
+							stack->blkno, stack->parentblk);
+				}
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %d offset %d",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode = AccessShareLock;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+	gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 9ecb1999e3..ed3ca779cb 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,23 +21,14 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
-#include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
-
 
 PG_MODULE_MAGIC;
 
@@ -212,23 +203,13 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+
+/* Lock acquisition reused across different AM types */
+void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+						Relation *heaprel, LOCKMODE	lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	bool		heapkeyspace;
-	LOCKMODE	lockmode;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -240,9 +221,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = heap_open(heapid, lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -256,27 +237,23 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
-
-	/* Relation suitable for checking as B-Tree? */
-	btree_index_checkable(indrel);
-
-	/* Check index, possibly against table it is an index on */
-	heapkeyspace = _bt_heapkeyspace(indrel);
-	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
-						 heapallindexed, rootdescend);
+						RelationGetRelationName(*indrel))));
+}
 
+/* Pair for amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
 	/*
 	 * Release locks early. That's ok here because nothing in the called
 	 * routines will trigger shared cache invalidations to be sent, so we can
@@ -287,6 +264,38 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 		table_close(heaprel, lockmode);
 }
 
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
+						bool rootdescend)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+	bool 		heapkeyspace;
+
+	if (parentcheck)
+		lockmode = ShareLock;
+	else
+		lockmode = AccessShareLock;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
+
+	/* Relation suitable for checking as B-Tree? */
+	btree_index_checkable(indrel);
+
+	/* Check index, possibly against table it is an index on */
+	heapkeyspace = _bt_heapkeyspace(indrel);
+	bt_check_every_level(indrel, heaprel, heapkeyspace, parentcheck,
+						 heapallindexed, rootdescend);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+}
+
 /*
  * Basic checks about the suitability of a relation for checking as a B-Tree
  * index.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4..6a02e288b2 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (internal pages reference either only leaf pages or only internal
+      pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 2fddb23496..4bf25945ed 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -726,6 +726,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 			 * consistent with the key we're inserting. Update it if it's not.
 			 */
 			newtup = gistgetadjusted(state.r, idxtuple, itup, giststate);
+			//if (false) // THIS LINE IS INTENTIONALLY BROKEN
 			if (newtup)
 			{
 				/*
-- 
2.20.1

In reply to: Andrey Borodin (#20)
Re: amcheck verification for GiST

On Thu, Mar 28, 2019 at 11:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Here's an updated patch with AccessShareLock.
I've tried to stress this with a combination of random inserts, vacuums and checks. This process neither failed nor deadlocked.
The patch intentionally contains one superfluous line to make GiST logically broken. This triggers the amcheck regression test.
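
To illustrate what that broken line defeats: the check in verify_gist.c treats a non-NULL result from gistgetadjusted() as "the parent key would have to grow to cover this child", which should never happen in a healthy index. Below is a minimal standalone sketch of the same idea, assuming 2-D bounding boxes as keys. ToyBox and the toy_* functions are hypothetical names made up for this illustration; they are not part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy 2-D bounding box standing in for a GiST key (hypothetical type). */
typedef struct ToyBox
{
	double		xlo, ylo, xhi, yhi;
} ToyBox;

/* Smallest box that covers both a and b. */
static ToyBox
toy_box_union(ToyBox a, ToyBox b)
{
	ToyBox		u;

	u.xlo = a.xlo < b.xlo ? a.xlo : b.xlo;
	u.ylo = a.ylo < b.ylo ? a.ylo : b.ylo;
	u.xhi = a.xhi > b.xhi ? a.xhi : b.xhi;
	u.yhi = a.yhi > b.yhi ? a.yhi : b.yhi;
	return u;
}

/*
 * Toy analogue of "gistgetadjusted() returned non-NULL": true when the
 * parent key would have to be enlarged to cover the child key.
 */
static bool
toy_needs_adjustment(ToyBox parent, ToyBox child)
{
	ToyBox		u = toy_box_union(parent, child);

	return u.xlo != parent.xlo || u.ylo != parent.ylo ||
		u.xhi != parent.xhi || u.yhi != parent.yhi;
}

/*
 * Return the index of the first child key the parent fails to cover,
 * or -1 if the parent covers all of them (the invariant holds).
 */
static int
toy_first_uncovered(ToyBox parent, const ToyBox *children, int nchildren)
{
	int			i;

	assert(children != NULL && nchildren >= 0);
	for (i = 0; i < nchildren; i++)
	{
		if (toy_needs_adjustment(parent, children[i]))
			return i;
	}
	return -1;
}
```

If inserts skip the parent-key update, sooner or later some child sticks out of its parent's box and toy_first_uncovered() reports it. The real check additionally re-reads the parent before raising an error, since a concurrent insert or split can make a stale parent copy look inconsistent.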

Any thoughts on this, Heikki?

It would be nice to be able to squeeze this into Postgres 12,
especially given that GiST has been enhanced for 12 already.

--
Peter Geoghegan

#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrey Borodin (#20)
Re: amcheck verification for GiST

On 2019-Mar-29, Andrey Borodin wrote:

Here's updated patch with AccessShareLock.
I've tried to stress this with a combination of random inserts, vacuums and checks. This process neither failed nor deadlocked.
The patch intentionally contains one superfluous line to make GiST logically broken. This triggers the amcheck regression test.

How close are we to this being a committable patch? CF bot complains
that it doesn't apply anymore (please rebase), but from past discussion
it seems pretty close to ready.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Alvaro Herrera (#22)
1 attachment(s)
Re: amcheck verification for GiST

Hi!

On 4 Sept 2019, at 2:13, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2019-Mar-29, Andrey Borodin wrote:

Here's updated patch with AccessShareLock.
I've tried to stress this with a combination of random inserts, vacuums and checks. This process neither failed nor deadlocked.
The patch intentionally contains one superfluous line to make GiST logically broken. This triggers the amcheck regression test.

How close are we to this being a committable patch? CF bot complains
that it doesn't apply anymore (please rebase), but from past discussion
it seems pretty close to ready.

Here's a rebased version. Changes in v9:
* adjust to usage of table_open
* update new extension version
* check for main fork presence in GiST check too
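
A rough sketch of the rule behind the last item, for readers who have not looked at the B-tree side: as far as I understand it, an unlogged index has no main fork while the server is in recovery, so there is nothing to verify and the check should be skipped rather than fail. The body of amcheck_index_mainfork_expected() is not visible in this part of the patch, so the code below only models that decision with hypothetical toy types (ToyPersistence, toy_mainfork_expected); it is an assumption about the intent, not the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a relation's persistence setting. */
typedef enum ToyPersistence
{
	TOY_PERMANENT,
	TOY_UNLOGGED,
	TOY_TEMP
} ToyPersistence;

/*
 * Return true when the index is expected to have a main fork and can be
 * verified.  Assumption: only "unlogged index while in recovery" lacks one.
 */
static bool
toy_mainfork_expected(ToyPersistence persistence, bool in_recovery)
{
	/* Guard against values outside the toy enum. */
	assert(persistence == TOY_PERMANENT ||
		   persistence == TOY_UNLOGGED ||
		   persistence == TOY_TEMP);

	if (persistence != TOY_UNLOGGED || !in_recovery)
		return true;

	/* A real implementation would presumably emit a NOTICE and skip. */
	return false;
}
```

A caller would run the verification only when this returns true and return quietly otherwise.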

Thanks!

Best regards, Andrey Borodin.

Attachments:

0001-GiST-verification-function-for-amcheck-v9.patch (application/octet-stream)
From eb85f9563e5a7156dc27bb8767b3028357f72400 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Thu, 28 Mar 2019 14:46:55 +0500
Subject: [PATCH] GiST verification function for amcheck v9

---
 contrib/amcheck/Makefile                |   6 +-
 contrib/amcheck/amcheck--1.2--1.3.sql   |  14 +
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  34 +++
 contrib/amcheck/expected/check_gist.out |  16 +
 contrib/amcheck/sql/check_gist.sql      |   6 +
 contrib/amcheck/verify_gist.c           | 378 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  95 +++---
 doc/src/sgml/amcheck.sgml               |  19 ++
 9 files changed, 524 insertions(+), 46 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.2--1.3.sql
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..8a2a93cd74 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,13 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= verify_nbtree.o verify_gist.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA = amcheck--1.2--1.3.sql amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.2--1.3.sql b/contrib/amcheck/amcheck--1.2--1.3.sql
new file mode 100644
index 0000000000..44b88a40a0
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.2--1.3.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.2--1.3.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.3'" to load this file. \quit
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index c6e310046d..ab50931f75 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.2'
+default_version = '1.3'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..9d7d3f9882
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,34 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "access/transam.h"
+#include "catalog/index.h"
+#include "catalog/pg_am.h"
+#include "commands/tablecmds.h"
+#include "miscadmin.h"
+#include "storage/lmgr.h"
+#include "utils/memutils.h"
+#include "utils/snapmgr.h"
+
+extern void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+						Relation *heaprel, LOCKMODE	lockmode);
+
+extern void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode);
+
+extern bool
+amcheck_index_mainfork_expected(Relation rel);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..fe884dbac4
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,16 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..cbfae60883
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,6 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..c98dc0817b
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,378 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+
+
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+					BlockNumber childblkno, BufferAccessStrategy strategy);
+
+static void gist_check_parent_keys_consistency(Relation rel);
+
+static void gist_index_checkable(Relation rel);
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %d",
+						RelationGetRelationName(rel), blockNo)));
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted internal page %d",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has deleted page %d with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else
+	{
+		if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					errmsg("index \"%s\" has page %d with exceeding count of tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+}
+
+/*
+ * Try to re-find downlink pointing to 'childblkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno, BlockNumber childblkno,
+				   BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno,
+								   RBM_NORMAL, strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemId(parentpage, o);
+		IndexTuple itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans
+ * through GiST graph.
+ * This function verifies that tuples of internal pages cover all the key
+ * space of each tuple on leaf page. To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries
+ * to adjust it by tuples on referenced child page. Parent gist tuple should
+ * never require an adjustment.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter
+	 * a leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber i,
+					maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the parent,
+		 * so that we missed the downlink of the right sibling when we
+		 * scanned the parent. If so, add the right sibling to the stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+			{
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal"
+						 		" encountered leaf page unexpectedly on block %d",
+								RelationGetRelationName(rel), stack->blkno)));
+			}
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId		iid = PageGetItemId(page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.
+			 * See also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples. We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as"
+						 		" invalid, block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split"
+						 			" at crash recovery before upgrading to"
+									" PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes,"
+						 		" block %d, offset %d",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 *
+			 * XXX: shouldn't we rather use gist_consistent?
+			 */
+			if (stack->parenttup && gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call
+				 * of gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+														stack->blkno, strategy);
+
+				/* We found it - make a final check before failing */
+				if (!stack->parenttup)
+				{
+					elog(NOTICE, "Unable to find parent tuple for block %d on "
+							"block %d due to concurrent split",
+							stack->blkno, stack->parentblk);
+				}
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+				{
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %d offset %d",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				}
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+/* Check that relation is eligible for GiST verification */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this"
+						 " verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+Datum
+gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode = AccessShareLock;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
+
+	/* verify that this is a GiST index eligible for checking */
+	gist_index_checkable(indrel);
+	if (amcheck_index_mainfork_expected(indrel))
+		gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 05e7d678ed..cd238750b3 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -21,23 +21,15 @@
  *
  *-------------------------------------------------------------------------
  */
-#include "postgres.h"
+#include "amcheck.h"
 
-#include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
-#include "catalog/index.h"
-#include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
-#include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
-#include "utils/memutils.h"
-#include "utils/snapmgr.h"
 
 
 PG_MODULE_MAGIC;
@@ -129,7 +121,6 @@ PG_FUNCTION_INFO_V1(bt_index_parent_check);
 static void bt_index_check_internal(Oid indrelid, bool parentcheck,
 									bool heapallindexed, bool rootdescend);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -217,22 +208,13 @@ bt_index_parent_check(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
-/*
- * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
- */
-static void
-bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
-						bool rootdescend)
+
+/* Lock acquisition reused across different am types */
+void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+						Relation *heaprel, LOCKMODE	lockmode)
 {
 	Oid			heapid;
-	Relation	indrel;
-	Relation	heaprel;
-	LOCKMODE	lockmode;
-
-	if (parentcheck)
-		lockmode = ShareLock;
-	else
-		lockmode = AccessShareLock;
 
 	/*
 	 * We must lock table before index to avoid deadlocks.  However, if the
@@ -244,9 +226,9 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 */
 	heapid = IndexGetRelation(indrelid, true);
 	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
+		*heaprel = table_open(heapid, lockmode);
 	else
-		heaprel = NULL;
+		*heaprel = NULL;
 
 	/*
 	 * Open the target index relations separately (like relation_openrv(), but
@@ -260,23 +242,57 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	 * committed or recently dead heap tuples lacking index entries due to
 	 * concurrent activity.)
 	 */
-	indrel = index_open(indrelid, lockmode);
+	*indrel = index_open(indrelid, lockmode);
 
 	/*
 	 * Since we did the IndexGetRelation call above without any lock, it's
 	 * barely possible that a race against an index drop/recreation could have
 	 * netted us the wrong table.
 	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_TABLE),
 				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
+						RelationGetRelationName(*indrel))));
+}
+
+/* Counterpart of amcheck_lock_relation() */
+void amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel, LOCKMODE	lockmode)
+{
+	/*
+	 * Release locks early. That's ok here because nothing in the called
+	 * routines will trigger shared cache invalidations to be sent, so we can
+	 * relax the usual pattern of only releasing locks after commit.
+	 */
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Helper for bt_index_[parent_]check, coordinating the bulk of the work.
+ */
+static void
+bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
+						bool rootdescend)
+{
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode;
+	bool 		heapkeyspace;
+
+	if (parentcheck)
+		lockmode = ShareLock;
+	else
+		lockmode = AccessShareLock;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
 
 	/* Relation suitable for checking as B-Tree? */
 	btree_index_checkable(indrel);
 
-	if (btree_index_mainfork_expected(indrel))
+	if (amcheck_index_mainfork_expected(indrel))
 	{
 		bool	heapkeyspace;
 
@@ -293,14 +309,8 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 							 heapallindexed, rootdescend);
 	}
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
 }
 
 /*
@@ -338,14 +348,15 @@ btree_index_checkable(Relation rel)
 }
 
 /*
- * Check if B-Tree index relation should have a file for its main relation
+ * Check if index relation should have a file for its main relation
  * fork.  Verification uses this to skip unlogged indexes when in hot standby
  * mode, where there is simply nothing to verify.
  *
- * NB: Caller should call btree_index_checkable() before calling here.
+ * NB: Caller should call btree_index_checkable() or gist_index_checkable()
+ * before calling here.
  */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
+bool
+amcheck_index_mainfork_expected(Relation rel)
 {
 	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
 		!RecoveryInProgress())
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4..6a02e288b2 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that its page graph respects balanced-tree
+      invariants (each internal page references either only leaf pages or
+      only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.1

#24Alvaro Herrera from 2ndQuadrant
alvherre@alvh.no-ip.org
In reply to: Andrey Borodin (#23)
Re: amcheck verification for GiST

On 2019-Sep-06, Andrey Borodin wrote:

Here's rebased version. Changes in v9:
* adjust to usage of table_open
* update new extension version
* check for main fork presence in GiST check too

Cool. On a quick eyeball, your new amcheck.h does not conform to our
conventions: it should not include postgres.h, and it should only
include other headers as needed in order for it to compile standalone (I
think you just need utils/relcache.h and storage/lockdefs.h). All the
other headers needed for .c files should be in the corresponding .c
files, not in the .h.

Please don't split error messages (errmsg, errdetail etc) in multiple
lines; just let the line run long (do put arguments beyond a long
format string into separate lines, though). This improves greppability.
There are some other minor style violations -- nothing that would not be
fixed by pgindent.
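
To illustrate the greppability point, here is a minimal standalone sketch in plain C (it is not PostgreSQL code; the variable and function names are made up for illustration, and the message text is borrowed from the patch). The compiler concatenates adjacent string literals, so both spellings yield the same message at run time, but searching the source tree for the text a user reports, for example "tuple sizes, block", only matches the unsplit form.

```c
#include <string.h>

/* Style the review asks to avoid: one message split across two literals. */
static const char *const split_msg =
	"index \"%s\" has inconsistent tuple sizes,"
	" block %u, offset %u";

/* Preferred style: the whole format string on one (long) source line. */
static const char *const unsplit_msg =
	"index \"%s\" has inconsistent tuple sizes, block %u, offset %u";

/* Returns 1 when both spellings produce the same string at run time. */
int
messages_identical(void)
{
	return strcmp(split_msg, unsplit_msg) == 0;
}
```

Since the run-time strings are identical, joining the literals changes nothing for users; it only makes the message findable in the source.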

Peter, Heikki, are you going to do [at least] one more round of
design/functional review?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#25Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera from 2ndQuadrant (#24)
Re: amcheck verification for GiST

On Fri, Sep 6, 2019 at 7:02 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:

Peter, Heikki, are you going to do [at least] one more round of
design/functional review?

I didn't plan on it, but somebody probably should. Are you offering to
commit the patch? If not, I can take care of it.

--
Peter Geoghegan

#26Alvaro Herrera from 2ndQuadrant
alvherre@alvh.no-ip.org
In reply to: Peter Geoghegan (#25)
Re: amcheck verification for GiST

On 2019-Sep-06, Peter Geoghegan wrote:

On Fri, Sep 6, 2019 at 7:02 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:

Peter, Heikki, are you going to do [at least] one more round of
design/functional review?

I didn't plan on it, but somebody probably should. Are you offering to
commit the patch? If not, I can take care of it.

I'd welcome it more if you did it; thanks.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#27Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera from 2ndQuadrant (#26)
Re: amcheck verification for GiST

On Fri, Sep 6, 2019 at 2:35 PM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:

I'd welcome it more if you did it; thanks.

I'll take care of it, then.

--
Peter Geoghegan

#28Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#27)
1 attachment(s)
Re: amcheck verification for GiST

On Fri, Sep 6, 2019 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote:

I'll take care of it, then.

Attached is v10, which has some comment and style fix-ups, including
the stuff Alvaro mentioned. It also adds line pointer sanitization to
match what I added to verify_nbtree.c in commit a9ce839a (we use a
custom PageGetItemIdCareful() for GiST instead of a simple
PageGetItemId()). I also added a new file/TU for the routines that are
now common to both nbtree and GiST verification, which I named
amcheck.c. (I'm not sure about that, but I don't like verify_nbtree.c
having generic/common functions.)
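
The idea behind line pointer sanitization can be sketched as follows. This is a simplified standalone model, not the code from verify_nbtree.c or from the attached patch: the struct, the function name, and the fixed 8 kB page size are all assumptions made for illustration. The point is only that a line pointer is validated against the page bounds before the tuple it refers to is dereferenced.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 8192u	/* assumption: default 8 kB block size */

/* Hypothetical, simplified stand-in for a line pointer (not ItemIdData). */
typedef struct ModelLinePointer
{
	uint32_t	off;	/* byte offset of the tuple within the page */
	uint32_t	len;	/* tuple length in bytes */
	bool		used;	/* does the pointer refer to live storage? */
} ModelLinePointer;

/*
 * Return true when the line pointer may be dereferenced safely: it is in
 * use, has nonzero length, and the tuple ends before the special space
 * of 'special_size' bytes at the end of the page.
 */
bool
model_line_pointer_is_sane(ModelLinePointer lp, uint32_t special_size)
{
	if (special_size > MODEL_PAGE_SIZE)
		return false;
	if (!lp.used || lp.len == 0)
		return false;
	/* compare without computing off + len, which could overflow */
	if (lp.off > MODEL_PAGE_SIZE - special_size)
		return false;
	return lp.len <= MODEL_PAGE_SIZE - special_size - lp.off;
}
```

In amcheck itself such a check would presumably report the problem as index corruption through ereport() rather than return a flag; the boolean form is used here only to keep the sketch self-contained.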

I have only had a few hours to polish this, which doesn't seem like
enough, though it was enough to fix the noticeable stuff.

My main concern now is the heavyweight lock strength needed by the new
function. I don't feel particularly qualified to sign off on the
concurrency aspects of the patch. Heikki's v6 used a ShareLock, like
bt_index_parent_check(), but you went back to an AccessShareLock,
Andrey. Why is this safe? I see that you do gist_refind_parent() in
your v9 a little differently to Heikki in his v6, which you seemed to
suggest made this safe in your e-mail on March 28, but I don't
understand that at all.
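
For reference while weighing the lock level: besides the heavyweight lock, the traversal in the patch re-checks every page it visits with the condition GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page), and queues the right sibling when it holds. The sketch below only restates that condition in a standalone model. The type and function names are hypothetical, and it does not by itself show that AccessShareLock is sufficient, which remains the open question.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ModelLsn;		/* stand-in for XLogRecPtr */

/* Hypothetical, simplified view of the GiST page fields the check reads. */
typedef struct ModelGistPage
{
	bool		follow_right;	/* models GistFollowRight(page) */
	ModelLsn	nsn;			/* models GistPageGetNSN(page) */
	uint32_t	rightlink;		/* block number of the right sibling */
} ModelGistPage;

/*
 * Mirror of the patch's condition: the right sibling must be visited when
 * the page carries the follow-right flag, or when its NSN is newer than
 * the parent LSN remembered at the time the parent was scanned.
 */
bool
model_must_visit_right_sibling(ModelLsn parentlsn, const ModelGistPage *page)
{
	return page->follow_right || parentlsn < page->nsn;
}
```

When the function returns true, the patch pushes a new stack item for the right sibling that inherits the parent tuple, parent block, and parent LSN of the current item.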

--
Peter Geoghegan

Attachments:

v10-0001-Revisions-of-GiST-amcheck-from-Peter.patch (application/octet-stream)
From f39ccae5d7440a3893c7a5a3f78fe730cd793ca1 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Thu, 28 Mar 2019 14:46:55 +0500
Subject: [PATCH v10] Revisions of GiST amcheck from Peter

---
 contrib/amcheck/Makefile                |   9 +-
 contrib/amcheck/amcheck--1.2--1.3.sql   |  14 +
 contrib/amcheck/amcheck.c               | 111 +++++++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  20 ++
 contrib/amcheck/expected/check_gist.out |  18 ++
 contrib/amcheck/sql/check_gist.sql      |   9 +
 contrib/amcheck/verify_gist.c           | 414 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         |  82 +----
 doc/src/sgml/amcheck.sgml               |  19 ++
 10 files changed, 619 insertions(+), 79 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.2--1.3.sql
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..c08d2a582b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,16 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= amcheck.o verify_gist.o verify_nbtree.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA =  amcheck--1.2--1.3.sql \
+	amcheck--1.1--1.2.sql \
+	amcheck--1.0--1.1.sql \
+	amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.2--1.3.sql b/contrib/amcheck/amcheck--1.2--1.3.sql
new file mode 100644
index 0000000000..44b88a40a0
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.2--1.3.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.2--1.3.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.3'" to load this file. \quit
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..e65bdfcb5f
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,111 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+/*
+ * Lock acquisition reused across different am types
+ */
+void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+					  Relation *heaprel, LOCKMODE lockmode)
+{
+	Oid			heapid;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when lockmode is
+	 * AccessExclusiveLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+		*heaprel = table_open(heapid, lockmode);
+	else
+		*heaprel = NULL;
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when lockmode is
+	 * AccessExclusiveLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the nbtree heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If caller is about to do nbtree parentcheck verification, there is no
+	 * question about committed or recently dead heap tuples lacking index
+	 * entries due to concurrent activity.)
+	 */
+	*indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index %s",
+						RelationGetRelationName(*indrel))));
+}
+
+/*
+ * Unlock index and heap relations early for amcheck_lock_relation() caller.
+ *
+ * This is ok because nothing in the called routines will trigger shared cache
+ * invalidations to be sent, so we can relax the usual pattern of only
+ * releasing locks after commit.
+ */
+void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel,
+						LOCKMODE lockmode)
+{
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call btree_index_checkable() or gist_index_checkable()
+ * before calling here.
+ */
+bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index c6e310046d..ab50931f75 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.2'
+default_version = '1.3'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..721ab558b6
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,20 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+
+extern void amcheck_lock_relation(Oid indrelid, Relation *indrel,
+								  Relation *heaprel, LOCKMODE lockmode);
+extern void amcheck_unlock_relation(Oid indrelid, Relation indrel,
+									Relation heaprel, LOCKMODE lockmode);
+extern bool amcheck_index_mainfork_expected(Relation rel);
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..8f3ec20946
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,18 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..0ee2e943c9
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..7c5519802f
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,414 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+static ItemId PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+								   OffsetNumber offset);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode = AccessShareLock;
+
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
+
+	/* verify that this is a GiST index eligible for checking */
+	gist_index_checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid")));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph.  This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on leaf page.  To do this we invoke
+ * gist_check_internal_page() for every internal page.
+ *
+ * gist_check_internal_page() in its turn takes every tuple and tries to
+ * adjust it by tuples on referenced child page.  Parent gist tuple should
+ * never require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i);
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call of
+				 * gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* We found it - make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with excessive tuple count",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * See comments for equivalent nbtree function.
+ */
+static ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - sizeof(GISTPageOpaqueData))
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since GiST
+	 * never uses either.  Verify that line pointer has storage, too, since
+	 * even LP_DEAD items should have storage within GiST.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 05e7d678ed..e3a2942fcc 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -25,16 +25,14 @@
 
 #include "access/htup_details.h"
 #include "access/nbtree.h"
-#include "access/table.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
 #include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -129,7 +127,6 @@ PG_FUNCTION_INFO_V1(bt_index_parent_check);
 static void bt_index_check_internal(Oid indrelid, bool parentcheck,
 									bool heapallindexed, bool rootdescend);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -224,7 +221,6 @@ static void
 bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 						bool rootdescend)
 {
-	Oid			heapid;
 	Relation	indrel;
 	Relation	heaprel;
 	LOCKMODE	lockmode;
@@ -234,51 +230,15 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	else
 		lockmode = AccessShareLock;
 
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
-	else
-		heaprel = NULL;
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
-		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
+	/* Lock table and index at the necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
 
 	/* Relation suitable for checking as B-Tree? */
 	btree_index_checkable(indrel);
 
-	if (btree_index_mainfork_expected(indrel))
+	if (amcheck_index_mainfork_expected(indrel))
 	{
-		bool	heapkeyspace;
+		bool		heapkeyspace;
 
 		RelationOpenSmgr(indrel);
 		if (!smgrexists(indrel->rd_smgr, MAIN_FORKNUM))
@@ -293,14 +253,8 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 							 heapallindexed, rootdescend);
 	}
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
 }
 
 /*
@@ -337,28 +291,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(NOTICE,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4..6a02e288b2 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      index has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (each internal page references either only leaf pages or
+      only internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.1

#29Andrey Borodin
x4mmm@yandex-team.ru
In reply to: Peter Geoghegan (#28)
1 attachment(s)
Re: amcheck verification for GiST

Alvaro, Peter, thanks for working on this!

On 7 Sep 2019, at 6:26, Peter Geoghegan <pg@bowt.ie> wrote:

On Fri, Sep 6, 2019 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote:

I'll take care of it, then.

Attached is v10, which has some comment and style fix-ups, including
the stuff Alvaro mentioned. It also adds line pointer sanitization to
match what I added to verify_nbtree.c in commit a9ce839a (we use a
custom PageGetItemIdCareful() for GiST instead of a simple
PageGetItemId()). I also added a new file/TU for the routines that are
now common to both nbtree and GiST verification, which I named
amcheck.c. (I'm not sure about that, but I don't like verify_nbtree.c
having generic/common functions.)

Maybe we should move PageGetItemIdCareful() to amcheck.c too?
I think it can be reused later by the GIN entry tree and, to some extent, by SP-GiST.
SP-GiST uses redirect tuples, but does not store this information in the line pointer.
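
To make the reuse argument concrete, here is a minimal standalone sketch of the two checks such a shared helper performs, with the size of the AM-specific opaque area passed in as the only per-AM parameter. It deliberately does not use PostgreSQL headers: ToyItemId, TOY_BLCKSZ, the TOY_LP_* constants and toy_item_id_is_sane() are hypothetical stand-ins for ItemIdData, BLCKSZ, the line pointer flags and the body of PageGetItemIdCareful(), so treat it as an illustration rather than the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BLCKSZ 8192			/* stand-in for BLCKSZ (assumed default) */

/* Stand-ins for the line pointer states; names are hypothetical. */
enum
{
	TOY_LP_UNUSED = 0,
	TOY_LP_NORMAL = 1,
	TOY_LP_REDIRECT = 2,
	TOY_LP_DEAD = 3
};

/* Simplified line pointer: offset of the tuple, state flags, tuple length. */
typedef struct ToyItemId
{
	unsigned	off;
	unsigned	flags;
	unsigned	len;
} ToyItemId;

/*
 * Return true if the line pointer is safe to dereference.  'opaquesize' is
 * the size of the special area at the end of the page, which differs
 * between access methods.
 */
static bool
toy_item_id_is_sane(ToyItemId iid, size_t opaquesize)
{
	assert(opaquesize < TOY_BLCKSZ);

	/* The tuple must end before the opaque area begins. */
	if ((size_t) iid.off + iid.len > TOY_BLCKSZ - opaquesize)
		return false;

	/* No redirects, no unused slots, and every item must have storage. */
	if (iid.flags == TOY_LP_REDIRECT || iid.flags == TOY_LP_UNUSED ||
		iid.len == 0)
		return false;

	return true;
}
```

An access method that legitimately uses other line pointer states would need a variant of the second check, which is why I say SP-GiST could reuse it only to some extent.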

I have only had a few hours to polish this, which doesn't seem like
enough, though was enough to fix the noticeable stuff.

Cool, thanks!

My main concern now is the heavyweight lock strength needed by the new
function. I don't feel particularly qualified to sign off on the
concurrency aspects of the patch. Heikki's v6 used a ShareLock, like
bt_index_parent_check(), but you went back to an AccessShareLock,
Andrey. Why is this safe? I see that you do gist_refind_parent() in
your v9 a little differently to Heikki in his v6, which you seemed to
suggest made this safe in your e-mail on March 28, but I don't
understand that at all.

Heikki's version was reading childblkno instead of parentblkno, and thus never re-found the parent tuple.
Also, I've added a check that the internal page did not become a leaf. In that case we would be comparing heap tids with index tids, and could possibly find an incorrect match.
But in the current GiST implementation an inner page should never become a leaf. We do not delete inner pages.
I think we could even convert it into a sanity check, but I'm afraid that some day we will implement deletion of internal pages and forget to remove this check.
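
The logic of the re-find step, including the leaf guard, can be sketched without any PostgreSQL machinery. This is a simplified model under stated assumptions: ToyPage, TOY_MAX_TUPLES, TOY_INVALID_OFFSET and toy_refind_downlink() are hypothetical names, and buffer locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TUPLES 8
#define TOY_INVALID_OFFSET (-1)

/*
 * Simplified page.  On an internal page tids[] holds child block numbers;
 * on a leaf page it holds heap block numbers, which must never be compared
 * with index block numbers.
 */
typedef struct ToyPage
{
	bool		is_leaf;
	int			ntuples;
	unsigned	tids[TOY_MAX_TUPLES];
} ToyPage;

/*
 * Scan the page identified as the parent (not the child) for a downlink to
 * 'childblkno'.  Returns the tuple position, or TOY_INVALID_OFFSET when the
 * page is a leaf or no downlink is found.
 */
static int
toy_refind_downlink(const ToyPage *parent, unsigned childblkno)
{
	assert(parent != NULL && parent->ntuples <= TOY_MAX_TUPLES);

	/* A leaf stores heap tids, so a numeric match would be meaningless. */
	if (parent->is_leaf)
		return TOY_INVALID_OFFSET;

	for (int i = 0; i < parent->ntuples; i++)
	{
		if (parent->tids[i] == childblkno)
			return i;
	}

	return TOY_INVALID_OFFSET;
}
```

A caller would treat TOY_INVALID_OFFSET as "downlink moved by a concurrent split" rather than as corruption, which mirrors how the patch handles a NULL result from gist_refind_parent().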

The current lock mode is AccessShareLock.
Until we find some discrepancy in tuples, we follow the standard locking protocol of a GiST index scan.

When we suspect a key consistency violation, we hold a lock on the page containing the suspicious tuple. Then we take a pin and a lock on the page that was the parent of the current page some time before.
For an example showing that this lock order is valid, see gistfinishsplit(). Its comments state "On entry, the caller must hold a lock on stack->buffer", and line 1330 acquires LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE);
This function is used during inserts, but we are not going to modify data or place row locks, thus neither ROW EXCLUSIVE nor ROW SHARE is necessary.

PFA v11. There is one small comment in GistScanItem. Also, I've moved PageGetItemIdCareful() to amcheck.c. I'm not sure that this refactoring is beneficial for the patch, but it reduces code size.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v11-0001-GiST-amcheck-from-Andrey.patchapplication/octet-stream; name=v11-0001-GiST-amcheck-from-Andrey.patch; x-unix-mode=0600Download
From af86f97a04ca6ef40147a7c5f67e442cbd231ee9 Mon Sep 17 00:00:00 2001
From: Andrey <amborodin@acm.org>
Date: Thu, 28 Mar 2019 14:46:55 +0500
Subject: [PATCH] GiST amcheck from Andrey

---
 contrib/amcheck/Makefile                |   9 +-
 contrib/amcheck/amcheck--1.2--1.3.sql   |  14 +
 contrib/amcheck/amcheck.c               | 160 ++++++++++
 contrib/amcheck/amcheck.control         |   2 +-
 contrib/amcheck/amcheck.h               |  23 ++
 contrib/amcheck/expected/check_gist.out |  18 ++
 contrib/amcheck/sql/check_gist.sql      |   9 +
 contrib/amcheck/verify_gist.c           | 374 ++++++++++++++++++++++++
 contrib/amcheck/verify_nbtree.c         | 176 ++---------
 doc/src/sgml/amcheck.sgml               |  19 ++
 10 files changed, 653 insertions(+), 151 deletions(-)
 create mode 100644 contrib/amcheck/amcheck--1.2--1.3.sql
 create mode 100644 contrib/amcheck/amcheck.c
 create mode 100644 contrib/amcheck/amcheck.h
 create mode 100644 contrib/amcheck/expected/check_gist.out
 create mode 100644 contrib/amcheck/sql/check_gist.sql
 create mode 100644 contrib/amcheck/verify_gist.c

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index dcec3b8520..c08d2a582b 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -1,13 +1,16 @@
 # contrib/amcheck/Makefile
 
 MODULE_big	= amcheck
-OBJS		= verify_nbtree.o $(WIN32RES)
+OBJS		= amcheck.o verify_gist.o verify_nbtree.o $(WIN32RES)
 
 EXTENSION = amcheck
-DATA = amcheck--1.1--1.2.sql amcheck--1.0--1.1.sql amcheck--1.0.sql
+DATA =  amcheck--1.2--1.3.sql \
+	amcheck--1.1--1.2.sql \
+	amcheck--1.0--1.1.sql \
+	amcheck--1.0.sql
 PGFILEDESC = "amcheck - function for verifying relation integrity"
 
-REGRESS = check check_btree
+REGRESS = check check_btree check_gist
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/amcheck/amcheck--1.2--1.3.sql b/contrib/amcheck/amcheck--1.2--1.3.sql
new file mode 100644
index 0000000000..44b88a40a0
--- /dev/null
+++ b/contrib/amcheck/amcheck--1.2--1.3.sql
@@ -0,0 +1,14 @@
+/* contrib/amcheck/amcheck--1.2--1.3.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION amcheck UPDATE TO '1.3'" to load this file. \quit
+
+--
+-- gist_index_parent_check()
+--
+CREATE FUNCTION gist_index_parent_check(index regclass)
+RETURNS VOID
+AS 'MODULE_PATHNAME', 'gist_index_parent_check'
+LANGUAGE C STRICT;
+
+REVOKE ALL ON FUNCTION gist_index_parent_check(regclass) FROM PUBLIC;
diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
new file mode 100644
index 0000000000..5bc3f831bc
--- /dev/null
+++ b/contrib/amcheck/amcheck.c
@@ -0,0 +1,160 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.c
+ *		Utility functions common to all access methods.
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/genam.h"
+#include "access/table.h"
+#include "access/tableam.h"
+#include "amcheck.h"
+#include "catalog/index.h"
+#include "commands/tablecmds.h"
+
+
+/*
+ * Lock acquisition reused across different am types
+ */
+void
+amcheck_lock_relation(Oid indrelid, Relation *indrel,
+					  Relation *heaprel, LOCKMODE lockmode)
+{
+	Oid			heapid;
+
+	/*
+	 * We must lock table before index to avoid deadlocks.  However, if the
+	 * passed indrelid isn't an index then IndexGetRelation() will fail.
+	 * Rather than emitting a not-very-helpful error message, postpone
+	 * complaining, expecting that the is-it-an-index test below will fail.
+	 *
+	 * In hot standby mode this will raise an error when lockmode is
+	 * AccessExclusiveLock.
+	 */
+	heapid = IndexGetRelation(indrelid, true);
+	if (OidIsValid(heapid))
+		*heaprel = table_open(heapid, lockmode);
+	else
+		*heaprel = NULL;
+
+	/*
+	 * Open the target index relations separately (like relation_openrv(), but
+	 * with heap relation locked first to prevent deadlocking).  In hot
+	 * standby mode this will raise an error when lockmode is
+	 * AccessExclusiveLock.
+	 *
+	 * There is no need for the usual indcheckxmin usability horizon test
+	 * here, even in the nbtree heapallindexed case, because index undergoing
+	 * verification only needs to have entries for a new transaction snapshot.
+	 * (If caller is about to do nbtree parentcheck verification, there is no
+	 * question about committed or recently dead heap tuples lacking index
+	 * entries due to concurrent activity.)
+	 */
+	*indrel = index_open(indrelid, lockmode);
+
+	/*
+	 * Since we did the IndexGetRelation call above without any lock, it's
+	 * barely possible that a race against an index drop/recreation could have
+	 * netted us the wrong table.
+	 */
+	if (*heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_TABLE),
+				 errmsg("could not open parent table of index %s",
+						RelationGetRelationName(*indrel))));
+}
+
+/*
+ * Unlock index and heap relations early for amcheck_lock_relation() caller.
+ *
+ * This is ok because nothing in the called routines will trigger shared cache
+ * invalidations to be sent, so we can relax the usual pattern of only
+ * releasing locks after commit.
+ */
+void
+amcheck_unlock_relation(Oid indrelid, Relation indrel, Relation heaprel,
+						LOCKMODE lockmode)
+{
+	index_close(indrel, lockmode);
+	if (heaprel)
+		table_close(heaprel, lockmode);
+}
+
+/*
+ * Check if index relation should have a file for its main relation
+ * fork.  Verification uses this to skip unlogged indexes when in hot standby
+ * mode, where there is simply nothing to verify.
+ *
+ * NB: Caller should call btree_index_checkable() or gist_index_checkable()
+ * before calling here.
+ */
+bool
+amcheck_index_mainfork_expected(Relation rel)
+{
+	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
+		!RecoveryInProgress())
+		return true;
+
+	ereport(NOTICE,
+			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
+			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
+					RelationGetRelationName(rel))));
+
+	return false;
+}
+
+/*
+ * PageGetItemId() wrapper that validates returned line pointer.
+ *
+ * Buffer page/page item access macros generally trust that line pointers are
+ * not corrupt, which might cause problems for verification itself.  For
+ * example, there is no bounds checking in PageGetItem().  Passing it a
+ * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
+ * to dereference.
+ *
+ * Validating line pointers before tuples avoids undefined behavior and
+ * assertion failures with corrupt indexes, making the verification process
+ * more robust and predictable.
+ */
+ItemId
+PageGetItemIdCareful(Relation rel, BlockNumber block, Page page,
+					 OffsetNumber offset, size_t opaquesize)
+{
+	ItemId		itemid = PageGetItemId(page, offset);
+
+	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
+		BLCKSZ - opaquesize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("line pointer points past end of tuple space in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	/*
+	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
+	 * and GiST never use either.  Verify that line pointer has storage, too,
+	 * since even LP_DEAD items should.
+	 */
+	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
+		ItemIdGetLength(itemid) == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("invalid line pointer storage in index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
+									block, offset, ItemIdGetOffset(itemid),
+									ItemIdGetLength(itemid),
+									ItemIdGetFlags(itemid))));
+
+	return itemid;
+}
\ No newline at end of file
diff --git a/contrib/amcheck/amcheck.control b/contrib/amcheck/amcheck.control
index c6e310046d..ab50931f75 100644
--- a/contrib/amcheck/amcheck.control
+++ b/contrib/amcheck/amcheck.control
@@ -1,5 +1,5 @@
 # amcheck extension
 comment = 'functions for verifying relation integrity'
-default_version = '1.2'
+default_version = '1.3'
 module_pathname = '$libdir/amcheck'
 relocatable = true
diff --git a/contrib/amcheck/amcheck.h b/contrib/amcheck/amcheck.h
new file mode 100644
index 0000000000..b19e617177
--- /dev/null
+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * amcheck.h
+ *		Shared routines for amcheck verifications.
+ *
+ * Copyright (c) 2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/amcheck.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "storage/lockdefs.h"
+#include "utils/relcache.h"
+
+extern void amcheck_lock_relation(Oid indrelid, Relation *indrel,
+								  Relation *heaprel, LOCKMODE lockmode);
+extern void amcheck_unlock_relation(Oid indrelid, Relation indrel,
+									Relation heaprel, LOCKMODE lockmode);
+extern bool amcheck_index_mainfork_expected(Relation rel);
+
+extern ItemId PageGetItemIdCareful(Relation rel, BlockNumber block,
+					 Page page, OffsetNumber offset, size_t opaquesize);
\ No newline at end of file
diff --git a/contrib/amcheck/expected/check_gist.out b/contrib/amcheck/expected/check_gist.out
new file mode 100644
index 0000000000..8f3ec20946
--- /dev/null
+++ b/contrib/amcheck/expected/check_gist.out
@@ -0,0 +1,18 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+ setseed 
+---------
+ 
+(1 row)
+
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+ gist_index_parent_check 
+-------------------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/sql/check_gist.sql b/contrib/amcheck/sql/check_gist.sql
new file mode 100644
index 0000000000..0ee2e943c9
--- /dev/null
+++ b/contrib/amcheck/sql/check_gist.sql
@@ -0,0 +1,9 @@
+-- minimal test, basically just verifying that amcheck works with GiST
+SELECT setseed(1);
+CREATE TABLE gist_check AS SELECT point(random(),s) c FROM generate_series(1,10000) s;
+INSERT INTO gist_check SELECT point(random(),s) c FROM generate_series(1,100000) s;
+CREATE INDEX gist_check_idx ON gist_check USING gist(c);
+SELECT gist_index_parent_check('gist_check_idx');
+
+-- cleanup
+DROP TABLE gist_check;
diff --git a/contrib/amcheck/verify_gist.c b/contrib/amcheck/verify_gist.c
new file mode 100644
index 0000000000..3c741071f0
--- /dev/null
+++ b/contrib/amcheck/verify_gist.c
@@ -0,0 +1,374 @@
+/*-------------------------------------------------------------------------
+ *
+ * verify_gist.c
+ *		Verifies the integrity of GiST indexes based on invariants.
+ *
+ * Verification checks that all paths in GiST graph contain
+ * consistent keys: tuples on parent pages consistently include tuples
+ * from children pages. Also, verification checks graph invariants:
+ * internal page must have at least one downlink, internal page can
+ * reference either only leaf pages or only internal pages.
+ *
+ *
+ * Copyright (c) 2017-2019, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *	  contrib/amcheck/verify_gist.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/gist_private.h"
+#include "amcheck.h"
+#include "catalog/pg_am.h"
+#include "miscadmin.h"
+#include "utils/memutils.h"
+#include "utils/rel.h"
+
+/*
+ * GistScanItem represents one item of depth-first scan of GiST index.
+ */
+typedef struct GistScanItem
+{
+	int			depth;
+	IndexTuple	parenttup;
+	BlockNumber parentblk;
+	XLogRecPtr	parentlsn;
+	BlockNumber blkno;
+	struct GistScanItem *next;
+} GistScanItem;
+
+PG_FUNCTION_INFO_V1(gist_index_parent_check);
+
+static void gist_index_checkable(Relation rel);
+static void gist_check_parent_keys_consistency(Relation rel);
+static void check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo);
+static IndexTuple gist_refind_parent(Relation rel, BlockNumber parentblkno,
+									 BlockNumber childblkno,
+									 BufferAccessStrategy strategy);
+
+/*
+ * gist_index_parent_check(index regclass)
+ *
+ * Verify integrity of GiST index.
+ *
+ * Acquires AccessShareLock on heap & index relations.
+ */
+Datum gist_index_parent_check(PG_FUNCTION_ARGS)
+{
+	Oid			indrelid = PG_GETARG_OID(0);
+	Relation	indrel;
+	Relation	heaprel;
+	LOCKMODE	lockmode = AccessShareLock;
+
+	/* Lock table and index at the necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
+
+	/* verify that this is GiST eligible for check */
+	gist_index_checkable(indrel);
+
+	if (amcheck_index_mainfork_expected(indrel))
+		gist_check_parent_keys_consistency(indrel);
+
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * Check that relation is eligible for GiST verification
+ */
+static void
+gist_index_checkable(Relation rel)
+{
+	if (rel->rd_rel->relkind != RELKIND_INDEX ||
+		rel->rd_rel->relam != GIST_AM_OID)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("only GiST indexes are supported as targets for this verification"),
+				 errdetail("Relation \"%s\" is not a GiST index.",
+						   RelationGetRelationName(rel))));
+
+	if (RELATION_IS_OTHER_TEMP(rel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot access temporary tables of other sessions"),
+				 errdetail("Index \"%s\" is associated with temporary relation.",
+						   RelationGetRelationName(rel))));
+
+	if (!rel->rd_index->indisvalid)
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot check index \"%s\"",
+						RelationGetRelationName(rel)),
+				 errdetail("Index is not valid.")));
+}
+
+/*
+ * Main entry point for GiST check. Allocates memory context and scans through
+ * GiST graph.  This function verifies that tuples of internal pages cover all
+ * the key space of each tuple on leaf page.  To do this, every tuple on each
+ * visited page is checked against the downlink that led to that page.
+ *
+ * The check tries to adjust the parent tuple by each tuple on the referenced
+ * child page, using gistgetadjusted().  Parent gist tuple should never
+ * require any adjustments.
+ */
+static void
+gist_check_parent_keys_consistency(Relation rel)
+{
+	BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
+	GistScanItem *stack;
+	MemoryContext mctx;
+	MemoryContext oldcontext;
+	GISTSTATE  *state;
+	int			leafdepth;
+
+	mctx = AllocSetContextCreate(CurrentMemoryContext,
+								 "amcheck context",
+								 ALLOCSET_DEFAULT_SIZES);
+	oldcontext = MemoryContextSwitchTo(mctx);
+
+	state = initGISTstate(rel);
+
+	/*
+	 * We don't know the height of the tree yet, but as soon as we encounter a
+	 * leaf page, we will set 'leafdepth' to its depth.
+	 */
+	leafdepth = -1;
+
+	/* Start the scan at the root page */
+	stack = (GistScanItem *) palloc0(sizeof(GistScanItem));
+	stack->depth = 0;
+	stack->parenttup = NULL;
+	stack->parentblk = InvalidBlockNumber;
+	stack->parentlsn = InvalidXLogRecPtr;
+	stack->blkno = GIST_ROOT_BLKNO;
+
+	while (stack)
+	{
+		GistScanItem *stack_next;
+		Buffer		buffer;
+		Page		page;
+		OffsetNumber  i, maxoff;
+		XLogRecPtr	lsn;
+
+		CHECK_FOR_INTERRUPTS();
+
+		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,
+									RBM_NORMAL, strategy);
+		LockBuffer(buffer, GIST_SHARE);
+		page = (Page) BufferGetPage(buffer);
+		lsn = BufferGetLSNAtomic(buffer);
+
+		/* Do basic sanity checks on the page headers */
+		check_index_page(rel, buffer, stack->blkno);
+
+		/*
+		 * It's possible that the page was split since we looked at the
+		 * parent, so that we missed the downlink of the right sibling
+		 * when we scanned the parent.  If so, add the right sibling to the
+		 * stack now.
+		 */
+		if (GistFollowRight(page) || stack->parentlsn < GistPageGetNSN(page))
+		{
+			/* split page detected, install right link to the stack */
+			GistScanItem *ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+
+			ptr->depth = stack->depth;
+			ptr->parenttup = CopyIndexTuple(stack->parenttup);
+			ptr->parentblk = stack->parentblk;
+			ptr->parentlsn = stack->parentlsn;
+			ptr->blkno = GistPageGetOpaque(page)->rightlink;
+			ptr->next = stack->next;
+			stack->next = ptr;
+		}
+
+		/* Check that the tree has the same height in all branches */
+		if (GistPageIsLeaf(page))
+		{
+			if (leafdepth == -1)
+				leafdepth = stack->depth;
+			else if (stack->depth != leafdepth)
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\": internal pages traversal encountered leaf page unexpectedly on block %u",
+								RelationGetRelationName(rel), stack->blkno)));
+		}
+
+		/*
+		 * Check that each tuple looks valid, and is consistent with the
+		 * downlink we followed when we stepped on this page.
+		 */
+		maxoff = PageGetMaxOffsetNumber(page);
+		for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
+		{
+			ItemId iid = PageGetItemIdCareful(rel, stack->blkno, page, i, sizeof(GISTPageOpaqueData));
+			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
+
+			/*
+			 * Check that it's not a leftover invalid tuple from pre-9.1.  See
+			 * also gistdoinsert() and gistbulkdelete() handling of such
+			 * tuples.  We do consider it an error here.
+			 */
+			if (GistTupleIsInvalid(idxtuple))
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("index \"%s\" contains an inner tuple marked as invalid, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i),
+						 errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
+						 errhint("Please REINDEX it.")));
+
+			if (MAXALIGN(ItemIdGetLength(iid)) != MAXALIGN(IndexTupleSize(idxtuple)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INDEX_CORRUPTED),
+						 errmsg("index \"%s\" has inconsistent tuple sizes, block %u, offset %u",
+								RelationGetRelationName(rel), stack->blkno, i)));
+
+			/*
+			 * Check if this tuple is consistent with the downlink in the
+			 * parent.
+			 */
+			if (stack->parenttup &&
+				gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+			{
+				/*
+				 * There was a discrepancy between parent and child tuples.
+				 * We need to verify it is not a result of concurrent call of
+				 * gistplacetopage(). So, lock parent and try to find downlink
+				 * for current page. It may be missing due to concurrent page
+				 * split, this is OK.
+				 */
+				pfree(stack->parenttup);
+				stack->parenttup = gist_refind_parent(rel, stack->parentblk,
+													  stack->blkno, strategy);
+
+				/* If we re-found it, make a final check before failing */
+				if (!stack->parenttup)
+					elog(NOTICE, "Unable to find parent tuple for block %u on block %u due to concurrent split",
+						 stack->blkno, stack->parentblk);
+				else if (gistgetadjusted(rel, stack->parenttup, idxtuple, state))
+					ereport(ERROR,
+							(errcode(ERRCODE_INDEX_CORRUPTED),
+							 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
+									RelationGetRelationName(rel), stack->blkno, i)));
+				else
+				{
+					/*
+					 * But now it is properly adjusted - nothing to do here.
+					 */
+				}
+			}
+
+			/* If this is an internal page, recurse into the child */
+			if (!GistPageIsLeaf(page))
+			{
+				GistScanItem *ptr;
+
+				ptr = (GistScanItem *) palloc(sizeof(GistScanItem));
+				ptr->depth = stack->depth + 1;
+				ptr->parenttup = CopyIndexTuple(idxtuple);
+				ptr->parentblk = stack->blkno;
+				ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
+				ptr->parentlsn = lsn;
+				ptr->next = stack->next;
+				stack->next = ptr;
+			}
+		}
+
+		LockBuffer(buffer, GIST_UNLOCK);
+		ReleaseBuffer(buffer);
+
+		/* Step to next item in the queue */
+		stack_next = stack->next;
+		if (stack->parenttup)
+			pfree(stack->parenttup);
+		pfree(stack);
+		stack = stack_next;
+	}
+
+	MemoryContextSwitchTo(oldcontext);
+	MemoryContextDelete(mctx);
+}
+
+static void
+check_index_page(Relation rel, Buffer buffer, BlockNumber blockNo)
+{
+	Page		page = BufferGetPage(buffer);
+
+	gistcheckpage(rel, buffer);
+
+	if (GistPageGetOpaque(page)->gist_page_id != GIST_PAGE_ID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has corrupted page %u",
+						RelationGetRelationName(rel), blockNo)));
+
+	if (GistPageIsDeleted(page))
+	{
+		if (!GistPageIsLeaf(page))
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted internal page %u",
+							RelationGetRelationName(rel), blockNo)));
+		if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
+			ereport(ERROR,
+					(errcode(ERRCODE_INDEX_CORRUPTED),
+					 errmsg("index \"%s\" has deleted page %u with tuples",
+							RelationGetRelationName(rel), blockNo)));
+	}
+	else if (PageGetMaxOffsetNumber(page) > MaxIndexTuplesPerPage)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("index \"%s\" has page %u with exceeding count of tuples",
+						RelationGetRelationName(rel), blockNo)));
+}
+
+/*
+ * Try to re-find downlink pointing to 'blkno', in 'parentblkno'.
+ *
+ * If found, returns a palloc'd copy of the downlink tuple. Otherwise,
+ * returns NULL.
+ */
+static IndexTuple
+gist_refind_parent(Relation rel, BlockNumber parentblkno,
+				   BlockNumber childblkno, BufferAccessStrategy strategy)
+{
+	Buffer		parentbuf;
+	Page		parentpage;
+	OffsetNumber o,
+				parent_maxoff;
+	IndexTuple	result = NULL;
+
+	parentbuf = ReadBufferExtended(rel, MAIN_FORKNUM, parentblkno, RBM_NORMAL,
+								   strategy);
+
+	LockBuffer(parentbuf, GIST_SHARE);
+	parentpage = BufferGetPage(parentbuf);
+
+	if (GistPageIsLeaf(parentpage))
+	{
+		UnlockReleaseBuffer(parentbuf);
+		return result;
+	}
+
+	parent_maxoff = PageGetMaxOffsetNumber(parentpage);
+	for (o = FirstOffsetNumber; o <= parent_maxoff; o = OffsetNumberNext(o))
+	{
+		ItemId p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o, sizeof(GISTPageOpaqueData));
+		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
+
+		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		{
+			/* Found it! Make copy and return it */
+			result = CopyIndexTuple(itup);
+			break;
+		}
+	}
+
+	UnlockReleaseBuffer(parentbuf);
+
+	return result;
+}
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 05e7d678ed..6f8b5714a3 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -25,16 +25,14 @@
 
 #include "access/htup_details.h"
 #include "access/nbtree.h"
-#include "access/table.h"
 #include "access/tableam.h"
 #include "access/transam.h"
 #include "access/xact.h"
+#include "amcheck.h"
 #include "catalog/index.h"
 #include "catalog/pg_am.h"
-#include "commands/tablecmds.h"
 #include "lib/bloomfilter.h"
 #include "miscadmin.h"
-#include "storage/lmgr.h"
 #include "storage/smgr.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -129,7 +127,6 @@ PG_FUNCTION_INFO_V1(bt_index_parent_check);
 static void bt_index_check_internal(Oid indrelid, bool parentcheck,
 									bool heapallindexed, bool rootdescend);
 static inline void btree_index_checkable(Relation rel);
-static inline bool btree_index_mainfork_expected(Relation rel);
 static void bt_check_every_level(Relation rel, Relation heaprel,
 								 bool heapkeyspace, bool readonly, bool heapallindexed,
 								 bool rootdescend);
@@ -163,8 +160,6 @@ static inline bool invariant_l_nontarget_offset(BtreeCheckState *state,
 static Page palloc_btree_page(BtreeCheckState *state, BlockNumber blocknum);
 static inline BTScanInsert bt_mkscankey_pivotsearch(Relation rel,
 													IndexTuple itup);
-static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
-								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
 
@@ -224,7 +219,6 @@ static void
 bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 						bool rootdescend)
 {
-	Oid			heapid;
 	Relation	indrel;
 	Relation	heaprel;
 	LOCKMODE	lockmode;
@@ -234,51 +228,15 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 	else
 		lockmode = AccessShareLock;
 
-	/*
-	 * We must lock table before index to avoid deadlocks.  However, if the
-	 * passed indrelid isn't an index then IndexGetRelation() will fail.
-	 * Rather than emitting a not-very-helpful error message, postpone
-	 * complaining, expecting that the is-it-an-index test below will fail.
-	 *
-	 * In hot standby mode this will raise an error when parentcheck is true.
-	 */
-	heapid = IndexGetRelation(indrelid, true);
-	if (OidIsValid(heapid))
-		heaprel = table_open(heapid, lockmode);
-	else
-		heaprel = NULL;
-
-	/*
-	 * Open the target index relations separately (like relation_openrv(), but
-	 * with heap relation locked first to prevent deadlocking).  In hot
-	 * standby mode this will raise an error when parentcheck is true.
-	 *
-	 * There is no need for the usual indcheckxmin usability horizon test
-	 * here, even in the heapallindexed case, because index undergoing
-	 * verification only needs to have entries for a new transaction snapshot.
-	 * (If this is a parentcheck verification, there is no question about
-	 * committed or recently dead heap tuples lacking index entries due to
-	 * concurrent activity.)
-	 */
-	indrel = index_open(indrelid, lockmode);
-
-	/*
-	 * Since we did the IndexGetRelation call above without any lock, it's
-	 * barely possible that a race against an index drop/recreation could have
-	 * netted us the wrong table.
-	 */
-	if (heaprel == NULL || heapid != IndexGetRelation(indrelid, false))
-		ereport(ERROR,
-				(errcode(ERRCODE_UNDEFINED_TABLE),
-				 errmsg("could not open parent table of index %s",
-						RelationGetRelationName(indrel))));
+	/* lock table and index with necessary level */
+	amcheck_lock_relation(indrelid, &indrel, &heaprel, lockmode);
 
 	/* Relation suitable for checking as B-Tree? */
 	btree_index_checkable(indrel);
 
-	if (btree_index_mainfork_expected(indrel))
+	if (amcheck_index_mainfork_expected(indrel))
 	{
-		bool	heapkeyspace;
+		bool		heapkeyspace;
 
 		RelationOpenSmgr(indrel);
 		if (!smgrexists(indrel->rd_smgr, MAIN_FORKNUM))
@@ -293,14 +251,8 @@ bt_index_check_internal(Oid indrelid, bool parentcheck, bool heapallindexed,
 							 heapallindexed, rootdescend);
 	}
 
-	/*
-	 * Release locks early. That's ok here because nothing in the called
-	 * routines will trigger shared cache invalidations to be sent, so we can
-	 * relax the usual pattern of only releasing locks after commit.
-	 */
-	index_close(indrel, lockmode);
-	if (heaprel)
-		table_close(heaprel, lockmode);
+	/* Unlock index and table */
+	amcheck_unlock_relation(indrelid, indrel, heaprel, lockmode);
 }
 
 /*
@@ -337,28 +289,6 @@ btree_index_checkable(Relation rel)
 				 errdetail("Index is not valid.")));
 }
 
-/*
- * Check if B-Tree index relation should have a file for its main relation
- * fork.  Verification uses this to skip unlogged indexes when in hot standby
- * mode, where there is simply nothing to verify.
- *
- * NB: Caller should call btree_index_checkable() before calling here.
- */
-static inline bool
-btree_index_mainfork_expected(Relation rel)
-{
-	if (rel->rd_rel->relpersistence != RELPERSISTENCE_UNLOGGED ||
-		!RecoveryInProgress())
-		return true;
-
-	ereport(NOTICE,
-			(errcode(ERRCODE_READ_ONLY_SQL_TRANSACTION),
-			 errmsg("cannot verify unlogged index \"%s\" during recovery, skipping",
-					RelationGetRelationName(rel))));
-
-	return false;
-}
-
 /*
  * Main entry point for B-Tree SQL-callable functions. Walks the B-Tree in
  * logical order, verifying invariants as it goes.  Optionally, verification
@@ -754,9 +684,9 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 				ItemId		itemid;
 
 				/* Internal page -- downlink gets leftmost on next level */
-				itemid = PageGetItemIdCareful(state, state->targetblock,
+				itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 											  state->target,
-											  P_FIRSTDATAKEY(opaque));
+											  P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 				itup = (IndexTuple) PageGetItem(state->target, itemid);
 				nextleveldown.leftmost = BTreeInnerTupleGetDownLink(itup);
 				nextleveldown.level = opaque->btpo.level - 1;
@@ -891,8 +821,8 @@ bt_target_page_check(BtreeCheckState *state)
 		IndexTuple	itup;
 
 		/* Verify line pointer before checking tuple */
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, P_HIKEY, sizeof(BTPageOpaqueData));
 		if (!_bt_check_natts(state->rel, state->heapkeyspace, state->target,
 							 P_HIKEY))
 		{
@@ -927,8 +857,8 @@ bt_target_page_check(BtreeCheckState *state)
 
 		CHECK_FOR_INTERRUPTS();
 
-		itemid = PageGetItemIdCareful(state, state->targetblock,
-									  state->target, offset);
+		itemid = PageGetItemIdCareful(state->rel, state->targetblock,
+									  state->target, offset, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(state->target, itemid);
 		tupsize = IndexTupleSize(itup);
 
@@ -1173,9 +1103,9 @@ bt_target_page_check(BtreeCheckState *state)
 							 OffsetNumberNext(offset));
 
 			/* Reuse itup to get pointed-to heap location of second item */
-			itemid = PageGetItemIdCareful(state, state->targetblock,
+			itemid = PageGetItemIdCareful(state->rel, state->targetblock,
 										  state->target,
-										  OffsetNumberNext(offset));
+										  OffsetNumberNext(offset), sizeof(BTPageOpaqueData));
 			itup = (IndexTuple) PageGetItem(state->target, itemid);
 			nhtid = psprintf("(%u,%u)",
 							 ItemPointerGetBlockNumberNoCheck(&(itup->t_tid)),
@@ -1460,8 +1390,8 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 	if (P_ISLEAF(opaque) && nline >= P_FIRSTDATAKEY(opaque))
 	{
 		/* Return first data item (if any) */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 P_FIRSTDATAKEY(opaque));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 P_FIRSTDATAKEY(opaque), sizeof(BTPageOpaqueData));
 	}
 	else if (!P_ISLEAF(opaque) &&
 			 nline >= OffsetNumberNext(P_FIRSTDATAKEY(opaque)))
@@ -1470,8 +1400,9 @@ bt_right_page_check_scankey(BtreeCheckState *state)
 		 * Return first item after the internal page's "negative infinity"
 		 * item
 		 */
-		rightitem = PageGetItemIdCareful(state, targetnext, rightpage,
-										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)));
+		rightitem = PageGetItemIdCareful(state->rel, targetnext, rightpage,
+										 OffsetNumberNext(P_FIRSTDATAKEY(opaque)),
+										 sizeof(BTPageOpaqueData));
 	}
 	else
 	{
@@ -1743,8 +1674,8 @@ bt_downlink_missing_check(BtreeCheckState *state)
 		 RelationGetRelationName(state->rel));
 
 	level = topaque->btpo.level;
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  P_FIRSTDATAKEY(topaque));
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  P_FIRSTDATAKEY(topaque), sizeof(BTPageOpaqueData));
 	itup = (IndexTuple) PageGetItem(state->target, itemid);
 	childblk = BTreeInnerTupleGetDownLink(itup);
 	for (;;)
@@ -1768,8 +1699,8 @@ bt_downlink_missing_check(BtreeCheckState *state)
 										level - 1, copaque->btpo.level)));
 
 		level = copaque->btpo.level;
-		itemid = PageGetItemIdCareful(state, childblk, child,
-									  P_FIRSTDATAKEY(copaque));
+		itemid = PageGetItemIdCareful(state->rel, childblk, child,
+									  P_FIRSTDATAKEY(copaque), sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		childblk = BTreeInnerTupleGetDownLink(itup);
 		/* Be slightly more pro-active in freeing this memory, just in case */
@@ -1818,7 +1749,7 @@ bt_downlink_missing_check(BtreeCheckState *state)
 	 */
 	if (P_ISHALFDEAD(copaque) && !P_RIGHTMOST(copaque))
 	{
-		itemid = PageGetItemIdCareful(state, childblk, child, P_HIKEY);
+		itemid = PageGetItemIdCareful(state->rel, childblk, child, P_HIKEY, sizeof(BTPageOpaqueData));
 		itup = (IndexTuple) PageGetItem(child, itemid);
 		if (BTreeTupleGetTopParent(itup) == state->targetblock)
 			return;
@@ -2161,8 +2092,8 @@ invariant_l_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, state->targetblock, state->target,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, state->targetblock, state->target,
+								  upperbound, sizeof(BTPageOpaqueData));
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
 	if (!key->heapkeyspace)
 		return invariant_leq_offset(state, key, upperbound);
@@ -2284,8 +2215,8 @@ invariant_l_nontarget_offset(BtreeCheckState *state, BTScanInsert key,
 	Assert(key->pivotsearch);
 
 	/* Verify line pointer before checking tuple */
-	itemid = PageGetItemIdCareful(state, nontargetblock, nontarget,
-								  upperbound);
+	itemid = PageGetItemIdCareful(state->rel, nontargetblock, nontarget,
+								  upperbound, sizeof(BTPageOpaqueData));
 	cmp = _bt_compare(state->rel, key, nontarget, upperbound);
 
 	/* pg_upgrade'd indexes may legally have equal sibling tuples */
@@ -2498,55 +2429,6 @@ bt_mkscankey_pivotsearch(Relation rel, IndexTuple itup)
 	return skey;
 }
 
-/*
- * PageGetItemId() wrapper that validates returned line pointer.
- *
- * Buffer page/page item access macros generally trust that line pointers are
- * not corrupt, which might cause problems for verification itself.  For
- * example, there is no bounds checking in PageGetItem().  Passing it a
- * corrupt line pointer can cause it to return a tuple/pointer that is unsafe
- * to dereference.
- *
- * Validating line pointers before tuples avoids undefined behavior and
- * assertion failures with corrupt indexes, making the verification process
- * more robust and predictable.
- */
-static ItemId
-PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block, Page page,
-					 OffsetNumber offset)
-{
-	ItemId		itemid = PageGetItemId(page, offset);
-
-	if (ItemIdGetOffset(itemid) + ItemIdGetLength(itemid) >
-		BLCKSZ - sizeof(BTPageOpaqueData))
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("line pointer points past end of tuple space in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	/*
-	 * Verify that line pointer isn't LP_REDIRECT or LP_UNUSED, since nbtree
-	 * never uses either.  Verify that line pointer has storage, too, since
-	 * even LP_DEAD items should within nbtree.
-	 */
-	if (ItemIdIsRedirected(itemid) || !ItemIdIsUsed(itemid) ||
-		ItemIdGetLength(itemid) == 0)
-		ereport(ERROR,
-				(errcode(ERRCODE_INDEX_CORRUPTED),
-				 errmsg("invalid line pointer storage in index \"%s\"",
-						RelationGetRelationName(state->rel)),
-				 errdetail_internal("Index tid=(%u,%u) lp_off=%u, lp_len=%u lp_flags=%u.",
-									block, offset, ItemIdGetOffset(itemid),
-									ItemIdGetLength(itemid),
-									ItemIdGetFlags(itemid))));
-
-	return itemid;
-}
-
 /*
  * BTreeTupleGetHeapTID() wrapper that lets caller enforce that a heap TID must
  * be present in cases where that is mandatory.
diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 627651d8d4..6a02e288b2 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -165,6 +165,25 @@ ORDER BY c.relpages DESC LIMIT 10;
      </para>
     </listitem>
    </varlistentry>
+
+   <varlistentry>
+    <term>
+     <function>gist_index_parent_check(index regclass) returns void</function>
+     <indexterm>
+      <primary>gist_index_parent_check</primary>
+     </indexterm>
+    </term>
+
+    <listitem>
+     <para>
+      <function>gist_index_parent_check</function> tests that its target GiST
+      has consistent parent-child tuple relations (no parent tuple
+      requires adjustment) and that the page graph respects balanced-tree
+      invariants (an internal page references either only leaf pages or only
+      internal pages).
+     </para>
+    </listitem>
+   </varlistentry>
   </variablelist>
  </sect2>
 
-- 
2.17.1

In reply to: Andrey Borodin (#29)
Re: amcheck verification for GiST

On Sun, Sep 8, 2019 at 1:21 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Maybe we should move PageGetItemIdCareful() to amcheck.c too?
I think it can be reused later by the GIN entry tree and, to some extent, by SP-GiST.
SP-GiST uses redirect tuples, but does not store this information in the line pointer.

Well, the details are slightly different in each case in at least one
way -- we use the size of the special area to determine the exact
bounds that it is safe for a tuple to appear in. The size of the
special area varies based on the access method. (Actually, pg_filedump
relies on this difference when inferring which access method a
particular page image is based on -- it starts out by looking at the
standard pd_special field that appears in page headers. So clearly
they're often different.)
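
To make the point about the special area concrete, here is a minimal
standalone sketch of the bounds test with the special-area size passed in
by the caller. It is not the patch's code: the sketch_ names and the
SKETCH_BLCKSZ constant are made up for illustration, and the special-area
sizes used in the usage note are arbitrary values, not the real sizes of
any access method's opaque struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch (8192 is PostgreSQL's default BLCKSZ). */
#define SKETCH_BLCKSZ 8192u

/*
 * Hypothetical helper: returns true when the storage described by a line
 * pointer, i.e. the byte range [lp_off, lp_off + lp_len), ends at or before
 * the start of the page's special area.  special_size is supplied by the
 * caller because it differs between access methods.
 */
static bool
sketch_lp_within_tuple_space(uint32_t lp_off, uint32_t lp_len,
							 size_t special_size)
{
	assert(special_size <= SKETCH_BLCKSZ);

	return (size_t) lp_off + lp_len <= SKETCH_BLCKSZ - special_size;
}
```

With a 16-byte special area, lp_off=8000 and lp_len=176 is accepted while
lp_len=177 is rejected; the same lp_len=177 pointer is accepted again if
the special area is only 8 bytes. That is why the caller has to say which
special-area size applies.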

My main concern now is the heavyweight lock strength needed by the new
function. I don't feel particularly qualified to sign off on the
concurrency aspects of the patch. Heikki's v6 used a ShareLock, like
bt_index_parent_check(), but you went back to an AccessShareLock,
Andrey. Why is this safe? I see that you do gist_refind_parent() in
your v9 a little differently to Heikki in his v6, which you seemed to
suggest made this safe in your e-mail on March 28, but I don't
understand that at all.

Heikki's version read childblkno instead of parentblkno, and thus never re-found the parent tuple.

When we suspect a key consistency violation, we hold a lock on the page containing the tuple in question. Then we take a pin and a lock on the page that was the parent of the current page some time before.
For an example of why this is valid, see gistfinishsplit(). Its comments state "On entry, the caller must hold a lock on stack->buffer", and line 1330 acquires LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE);
This function is used during inserts, but we are not going to modify data or place row locks, thus neither ROW EXCLUSIVE nor ROW SHARE is necessary.
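
The control flow under discussion (compare the child against the parent
tuple remembered from the earlier scan, and only on a mismatch re-read the
parent under lock and compare again) can be modeled with a toy sketch.
This is only an illustration under simplifying assumptions: keys are 1-D
integer ranges rather than real GiST keys, every sketch_ name is
hypothetical, and no buffers or locks are involved, so it says nothing
about whether the locking itself is safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an index key: a closed 1-D range [lo, hi]. */
typedef struct
{
	int			lo;
	int			hi;
} SketchKey;

typedef enum
{
	SKETCH_OK,					/* parent covers child, nothing to report */
	SKETCH_PARENT_GONE,			/* downlink not re-found, e.g. concurrent split */
	SKETCH_CORRUPT				/* fresh parent still does not cover child */
} SketchVerdict;

/* True if the parent range would have to grow to cover the child range. */
static bool
sketch_needs_adjustment(const SketchKey *parent, const SketchKey *child)
{
	return child->lo < parent->lo || child->hi > parent->hi;
}

/*
 * cached: parent key copied when the parent page was scanned earlier.
 * fresh:  parent key re-read just now, or NULL if the downlink to the
 *         child could no longer be found on the old parent page.
 */
static SketchVerdict
sketch_check_child(const SketchKey *cached, const SketchKey *fresh,
				   const SketchKey *child)
{
	assert(cached != NULL && child != NULL);

	/* Fast path: the remembered parent key already covers the child. */
	if (!sketch_needs_adjustment(cached, child))
		return SKETCH_OK;

	/* Mismatch: only a freshly re-read parent key can confirm corruption. */
	if (fresh == NULL)
		return SKETCH_PARENT_GONE;

	return sketch_needs_adjustment(fresh, child) ? SKETCH_CORRUPT : SKETCH_OK;
}
```

In this model a stale cached key never produces SKETCH_CORRUPT by itself.
A mismatch is reported as corruption only if it persists against the
re-read parent, and a vanished downlink is reported separately, much as
the patch emits a NOTICE rather than an ERROR in that case.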

But gistfinishsplit() is called when finishing a page split -- the
F_FOLLOW_RIGHT bit must be set on the leaf. Are you sure that you can
generalize from that, and safely relocate the parent?

I would be a lot more comfortable with this if Heikki weighed in. I am
also concerned about the correctness of this because of this paragraph
from the GiST README file:

"""
This page deletion algorithm works on a best-effort basis. It might fail to
find a downlink, if a concurrent page split moved it after the first stage.
In that case, we won't be able to remove all empty pages. That's OK, it's
not expected to happen very often, and hopefully the next VACUUM will clean
it up.
"""

Why is this not a problem for the new amcheck checks? Maybe this is a
very naive question. I don't claim to be a GiST expert.

--
Peter Geoghegan

#31Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#30)
Re: amcheck verification for GiST

On Wed, Sep 11, 2019 at 04:10:20PM -0700, Peter Geoghegan wrote:

Why is this not a problem for the new amcheck checks? Maybe this is a
very naive question. I don't claim to be a GiST expert.

This thread did not receive any updates for a couple of months, and
input was evidently awaited from Andrey, so I am marking it as returned
with feedback in the CF. Please feel free to update the CF entry or
register a new entry once you have dealt with the comments from Peter.
--
Michael