Verify true root on replicas with amcheck

Started by godjan •about 6 years ago4 messages
#1godjan •
g0dj4n@gmail.com
1 attachment(s)

Hi, we have trouble to detect true root corruptions on replicas. I made a patch for resolving it with the locking meta page and potential root page. I heard that amcheck has an invariant about locking no more than 1 page at a moment for avoiding deadlocks. Is there possible a deadlock situation?

Attachments:

amcheck_trueroot.patchapplication/octet-stream; name=amcheck_trueroot.patch; x-unix-mode=0644Download
commit 26b97b68db2727de3c946d7fcb7d47aa699cf277
Author: Georgy Rylov <g0dj4n@gmail.com>
Date:   Thu Jan 9 13:36:02 2020 +0500

    amcheck verify true root on replicas
    
    Here I add checking true root by getting info about it from meta page under
    lock instead of taking lock for the whole table like in parent_check.

diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 6a058ccdac..67d07f2ba5 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -167,6 +167,7 @@ static ItemId PageGetItemIdCareful(BtreeCheckState *state, BlockNumber block,
 								   Page page, OffsetNumber offset);
 static inline ItemPointer BTreeTupleGetHeapTIDCareful(BtreeCheckState *state,
 													  IndexTuple itup, bool nonpivot);
+static void bt_check_trueroot(BtreeCheckState *state);
 
 /*
  * bt_index_check(index regclass, heapallindexed boolean)
@@ -738,6 +739,10 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
 							 errmsg("block %u is not true root in index \"%s\"",
 									current, RelationGetRelationName(state->rel))));
 			}
+			else
+			{
+				bt_check_trueroot(state);
+			}
 
 			/*
 			 * Before beginning any non-trivial examination of level, prepare
@@ -2571,3 +2576,34 @@ BTreeTupleGetHeapTIDCareful(BtreeCheckState *state, IndexTuple itup,
 
 	return result;
 }
+
+static void bt_check_trueroot(BtreeCheckState *state)
+{
+	Buffer meta_buffer;
+	Page meta_page, root_page;
+	BTMetaPageData *metad;
+	BTPageOpaque root_opaque;
+	bool is_root;
+	meta_page = palloc(BLCKSZ);
+	meta_buffer = ReadBufferExtended(state->rel, MAIN_FORKNUM, BTREE_METAPAGE, RBM_NORMAL,
+									 state->checkstrategy);
+	LockBuffer(meta_buffer, BT_READ);
+	/*
+	 * Perform the same basic sanity checking that nbtree itself performs for
+	 * every page:
+	 */
+	_bt_checkpage(state->rel, meta_buffer);	
+	/* Only use copy of page in palloc()'d memory */
+	memcpy(meta_page, BufferGetPage(meta_buffer), BLCKSZ);
+	/* Get true root block from meta-page */
+	metad = BTPageGetMeta(meta_page);
+	root_page = palloc_btree_page(state, metad->btm_root);
+	root_opaque = (BTPageOpaque) PageGetSpecialPointer(root_page);
+	is_root = P_ISROOT(root_opaque);
+	UnlockReleaseBuffer(meta_buffer);
+	if (!is_root)
+		ereport(ERROR,
+				(errcode(ERRCODE_INDEX_CORRUPTED),
+				 errmsg("block %u is not true root in index \"%s\"",
+						metad->btm_root, RelationGetRelationName(state->rel))));
+}
In reply to: godjan • (#1)
Re: Verify true root on replicas with amcheck

On Thu, Jan 9, 2020 at 12:55 AM godjan • <g0dj4n@gmail.com> wrote:

Hi, we have trouble to detect true root corruptions on replicas. I made a patch for resolving it with the locking meta page and potential root page.

What do you mean by true root corruption? What is the cause of the
problem? What symptom does it have in your application?

While I was the one that wrote the existing !readonly/parent check for
the true root (a check which your patch makes work with the regular
bt_check_index() function), I wasn't thinking of any particular
corruption scenario at the time. I wrote the check simply because it
was easy to do so (with a heavyweight ShareLock on the index).

I heard that amcheck has an invariant about locking no more than 1 page at a moment for avoiding deadlocks. Is there possible a deadlock situation?

This is a conservative principle that I came up with when I wrote the
original version of amcheck. It's not strictly necessary, but it
seemed like a good idea. It should be safe to "couple" buffer locks in
a way that matches the B-Tree code -- as long as it is thought through
very carefully. I am probably going to relax the rule for one specific
case soon -- see:

/messages/by-id/F7527087-6E95-4077-B964-D2CAFEF6224B@yandex-team.ru

Your patch looks like it gets it right (it won't deadlock with other
sessions that access the metapage), but I hesitate to commit it
without a strong justification. Acquiring multiple buffer locks
concurrently is worth avoiding wherever possible.

--
Peter Geoghegan

#3David Steele
david@pgmasters.net
In reply to: godjan • (#1)
Re: Verify true root on replicas with amcheck

On 1/9/20 3:55 AM, godjan � wrote:

Hi, we have trouble to detect true root corruptions on replicas. I made a patch for resolving it with the locking meta page and potential root page. I heard that amcheck has an invariant about locking no more than 1 page at a moment for avoiding deadlocks. Is there possible a deadlock situation?

This patch no longer applies cleanly:
http://cfbot.cputube.org/patch_27_2418.log

The CF entry has been updated to Waiting on Author.

Also, it would be a good idea to answer Peter's questions down-thread if
you are interested in moving this patch forward.

Regards,
--
-David
david@pgmasters.net

#4David Steele
david@pgmasters.net
In reply to: Peter Geoghegan (#2)
Re: Verify true root on replicas with amcheck

On 1/16/20 7:40 PM, Peter Geoghegan wrote:

On Thu, Jan 9, 2020 at 12:55 AM godjan • <g0dj4n@gmail.com> wrote:

I heard that amcheck has an invariant about locking no more than 1 page at a moment for avoiding deadlocks. Is there possible a deadlock situation?

This is a conservative principle that I came up with when I wrote the
original version of amcheck. It's not strictly necessary, but it
seemed like a good idea. It should be safe to "couple" buffer locks in
a way that matches the B-Tree code -- as long as it is thought through
very carefully. I am probably going to relax the rule for one specific
case soon -- see:

/messages/by-id/F7527087-6E95-4077-B964-D2CAFEF6224B@yandex-team.ru

Your patch looks like it gets it right (it won't deadlock with other
sessions that access the metapage), but I hesitate to commit it
without a strong justification. Acquiring multiple buffer locks
concurrently is worth avoiding wherever possible.

I have marked this patch Returned with Feedback since it has been
sitting for a while with no response from the author.

Regards,
--
-David
david@pgmasters.net