reindex concurrently and two toast indexes

Started by Justin Pryzby · almost 6 years ago · 48 messages
#1Justin Pryzby
pryzby@telsasoft.com

Forking old, long thread:
/messages/by-id/36712441546604286@sas1-890ba5c2334a.qloud-c.yandex.net
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

About reindexing invalid indexes - I found one good question in the archives [1]: how about toast indexes?
I checked it now; I am able to drop an invalid toast index, but I cannot drop the redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove only invalid ones. Valid index gives "ERROR: permission denied: "pg_toast_16426_index_ccnew" is a system catalog"
[1]: /messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

It looks like this was never addressed.

I noticed a ccnew toast index sitting around since October - what do I do with it?

ts=# DROP INDEX pg_toast.pg_toast_463881620_index_ccnew;
ERROR: permission denied: "pg_toast_463881620_index_ccnew" is a system catalog
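For readers who want to retrace the quoted scenario, here is a minimal sketch (the table name and schema are illustrative, not taken from the original report):

```sql
-- Illustrative setup: any table with a TOAST-able column works
CREATE TABLE test_toast (id int PRIMARY KEY, data text);

-- session 1: take a lock that blocks the concurrent reindex
BEGIN;
SELECT * FROM test_toast FOR UPDATE;

-- session 2: blocks waiting on session 1; interrupt it with Ctrl+C,
-- leaving an invalid leftover toast index behind (_ccnew or _ccold,
-- depending on where the reindex was interrupted)
REINDEX TABLE CONCURRENTLY test_toast;

-- session 1:
COMMIT;

-- session 2: the plain REINDEX rebuilds the leftover index too,
-- making it valid and therefore no longer droppable
REINDEX TABLE test_toast;
```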

--
Justin

#2Michael Paquier
michael@paquier.xyz
In reply to: Justin Pryzby (#1)
Re: reindex concurrently and two toast indexes

On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote:

Forking old, long thread:
/messages/by-id/36712441546604286@sas1-890ba5c2334a.qloud-c.yandex.net
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

About reindexing invalid indexes - I found one good question in the archives [1]: how about toast indexes?
I checked it now; I am able to drop an invalid toast index, but I cannot drop the redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove
only invalid ones. Valid index gives "ERROR: permission denied:
"pg_toast_16426_index_ccnew" is a system catalog"
[1]: /messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

It looks like this was never addressed.

On HEAD, this exact scenario leads to the presence of an old toast
index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped
on a follow-up concurrent reindex:
=# reindex table CONCURRENTLY test_toast ;
WARNING: XX002: cannot reindex invalid index
"pg_toast.pg_toast_16385_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2863
REINDEX

And this toast index can be dropped while it remains invalid:
=# drop index pg_toast.pg_toast_16385_index_ccold;
DROP INDEX

I recall testing that stuff for all the interrupts which could be
triggered, and in this case it waits at step 5 within
WaitForLockersMultiple(). Now, in your case you take an extra step
with a plain REINDEX, which forces a rebuild of the invalid toast
index, making it valid again and hence not droppable.
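The duplicated state described here can be inspected with a catalog query along these lines (an illustrative diagnostic; the table name is a placeholder):

```sql
-- List every index on a table's TOAST relation and whether it is valid
SELECT it.relname AS toast_index, i.indisvalid
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
JOIN pg_index i ON i.indrelid = t.oid
JOIN pg_class it ON it.oid = i.indexrelid
WHERE c.relname = 'test_toast'
ORDER BY it.relname;
```

After the failed concurrent reindex followed by a plain REINDEX, this shows two rows with indisvalid = t for the same TOAST table.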

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Any thoughts?
--
Michael

#3Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#2)
Re: reindex concurrently and two toast indexes

On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote:

Forking old, long thread:
/messages/by-id/36712441546604286@sas1-890ba5c2334a.qloud-c.yandex.net
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

About reindexing invalid indexes - I found one good question in the archives [1]: how about toast indexes?
I checked it now; I am able to drop an invalid toast index, but I cannot drop the redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove
only invalid ones. Valid index gives "ERROR: permission denied:
"pg_toast_16426_index_ccnew" is a system catalog"
[1]: /messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

It looks like this was never addressed.

On HEAD, this exact scenario leads to the presence of an old toast
index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped
on a follow-up concurrent reindex:
=# reindex table CONCURRENTLY test_toast ;
WARNING: XX002: cannot reindex invalid index
"pg_toast.pg_toast_16385_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2863
REINDEX

And this toast index can be dropped while it remains invalid:
=# drop index pg_toast.pg_toast_16385_index_ccold;
DROP INDEX

I recall testing that stuff for all the interrupts which could be
triggered, and in this case it waits at step 5 within
WaitForLockersMultiple(). Now, in your case you take an extra step
with a plain REINDEX, which forces a rebuild of the invalid toast
index, making it valid again and hence not droppable.

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Or even automatically drop any invalid index on toast relation in
reindex_relation, since those can't be due to a failed CIC?

#4Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#3)
Re: reindex concurrently and two toast indexes

On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Or even automatically drop any invalid index on toast relation in
reindex_relation, since those can't be due to a failed CIC?

No, I don't much like outsmarting REINDEX with more index drops than
it needs to do. And this would not take care of the case of a REINDEX
INDEX done directly on a toast index.
--
Michael

#5Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#4)
Re: reindex concurrently and two toast indexes

On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Or even automatically drop any invalid index on toast relation in
reindex_relation, since those can't be due to a failed CIC?

No, I don't much like outsmarting REINDEX with more index drops than
it needs to do. And this would not take care of the case of a REINDEX
INDEX done directly on a toast index.

Well, we could still do both but I get the objection. Then skipping
invalid toast indexes in reindex_relation looks like the best fix.

#6Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#5)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Tue, Feb 18, 2020 at 07:39:49AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Or even automatically drop any invalid index on toast relation in
reindex_relation, since those can't be due to a failed CIC?

No, I don't much like outsmarting REINDEX with more index drops than
it needs to do. And this would not take care of the case of a REINDEX
INDEX done directly on a toast index.

Well, we could still do both but I get the objection. Then skipping
invalid toast indexes in reindex_relation looks like the best fix.

PFA a patch to fix the problem using this approach.

I also added isolation tester regression tests. The failure is simulated using
a pg_cancel_backend() on top of pg_stat_activity, using filters on a
specifically set application name and the query text to avoid any unwanted
interaction. I also added a 1s locking delay, to ensure that even slow/CCA
machines can consistently reproduce the failure. Maybe that's not enough, or
maybe testing this scenario is not worth the extra time.

Attachments:

0001-Don-t-reindex-invalid-indexes-on-TOAST-tables-v1.patch (text/plain; charset=us-ascii)
From 990d265e5d576b3b4133232f302d6207987f1511 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 21 Feb 2020 20:15:04 +0100
Subject: [PATCH] Don't reindex invalid indexes on TOAST tables.

Such indexes can only be duplicated leftovers of failed REINDEX CONCURRENTLY
commands.  As we only allow to drop invalid indexes on TOAST tables, reindexing
those would lead to useless duplicated indexes that can't be dropped anymore.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
---
 src/backend/catalog/index.c                   | 29 +++++++++++
 .../expected/reindex-concurrently.out         | 49 ++++++++++++++++++-
 .../isolation/specs/reindex-concurrently.spec | 23 +++++++++
 3 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586c37..201a3c39df 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_depend.h"
 #include "catalog/pg_description.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_tablespace.h"
@@ -3717,6 +3718,34 @@ reindex_relation(Oid relid, int flags, int options)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
 
+			/*
+			 * We skip any invalid index on a TOAST table.  Those can only be
+			 * a duplicate leftover of a failed REINDEX CONCURRENTLY, and if we
+			 * rebuild it it won't be possible to drop it anymore.
+			 */
+			if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
+			{
+				HeapTuple	tup;
+				bool		skipit;
+
+				tup = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
+				if (!HeapTupleIsValid(tup))
+					elog(ERROR, "cache lookup failed for index %u", indexOid);
+
+				skipit = ((Form_pg_index) GETSTRUCT(tup))->indisvalid == false;
+
+				ReleaseSysCache(tup);
+
+				if (skipit)
+				{
+					ereport(NOTICE,
+							 (errmsg("skipping invalid index \"%s.%s\"",
+									get_namespace_name(get_rel_namespace(indexOid)),
+									get_rel_name(indexOid))));
+					continue;
+				}
+			}
+
 			reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
 						  persistence, options);
 
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
index 9e04169b2f..fa9039c125 100644
--- a/src/test/isolation/expected/reindex-concurrently.out
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -1,4 +1,4 @@
-Parsed test spec with 3 sessions
+Parsed test spec with 5 sessions
 
 starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
@@ -76,3 +76,50 @@ step end1: COMMIT;
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
 step end2: COMMIT;
 step reindex: <... completed>
+
+starting permutation: lock timeout reindex unlock check_invalid normal_reindex check_invalid nowarn reindex check_invalid
+step lock: BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE;
+data           
+
+aa             
+step timeout: SET statement_timeout to 5000;
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step unlock: SELECT pg_sleep(6); COMMIT;
+pg_sleep       
+
+               
+step reindex: <... completed>
+error in steps unlock reindex: ERROR:  canceling statement due to statement timeout
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step normal_reindex: REINDEX TABLE reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step nowarn: SET client_min_messages = 'ERROR';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
index eb59fe0cba..0599e8e213 100644
--- a/src/test/isolation/specs/reindex-concurrently.spec
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -31,6 +31,22 @@ step "end2" { COMMIT; }
 
 session "s3"
 step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+step "nowarn" { SET client_min_messages = 'ERROR'; }
+step "timeout" { SET statement_timeout to 5000; }
+
+session "s4"
+step "lock" { BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE; }
+step "unlock" { SELECT pg_sleep(6); COMMIT; }
+
+session "s5"
+setup { SET client_min_messages = 'WARNING'; }
+step "normal_reindex" { REINDEX TABLE reind_con_tab; }
+step "check_invalid" {SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C"; }
 
 permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
 permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
@@ -38,3 +54,10 @@ permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
+# A failed REINDEX CONCURRENTLY will leave an invalid index on reind_con_tab
+# TOAST table.  Any  following successful REINDEX should leave this index as
+# invalid, otherwise we would end up with a useless and duplicated index that
+# can't be dropped.
+permutation "lock" "timeout" "reindex" "unlock" "check_invalid"
+    "normal_reindex" "check_invalid"
+    "nowarn" "reindex" "check_invalid"
-- 
2.20.1

#7Justin Pryzby
pryzby@telsasoft.com
In reply to: Michael Paquier (#2)
Re: reindex concurrently and two toast indexes

On Tue, Feb 18, 2020 at 02:29:33PM +0900, Michael Paquier wrote:

On Sun, Feb 16, 2020 at 01:08:35PM -0600, Justin Pryzby wrote:

Forking old, long thread:
/messages/by-id/36712441546604286@sas1-890ba5c2334a.qloud-c.yandex.net
On Fri, Jan 04, 2019 at 03:18:06PM +0300, Sergei Kornilov wrote:

About reindexing invalid indexes - I found one good question in the archives [1]: how about toast indexes?
I checked it now; I am able to drop an invalid toast index, but I cannot drop the redundant valid index.
Reproduce:
session 1: begin; select from test_toast ... for update;
session 2: reindex table CONCURRENTLY test_toast ;
session 2: interrupt by ctrl+C
session 1: commit
session 2: reindex table test_toast ;
and now we have two toast indexes. DROP INDEX is able to remove
only invalid ones. Valid index gives "ERROR: permission denied:
"pg_toast_16426_index_ccnew" is a system catalog"
[1]: /messages/by-id/CAB7nPqT+6igqbUb59y04NEgHoBeUGYteuUr89AKnLTFNdB8Hyw@mail.gmail.com

It looks like this was never addressed.

On HEAD, this exact scenario leads to the presence of an old toast
index pg_toast.pg_toast_*_index_ccold, causing the index to be skipped
on a follow-up concurrent reindex:
=# reindex table CONCURRENTLY test_toast ;
WARNING: XX002: cannot reindex invalid index
"pg_toast.pg_toast_16385_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2863
REINDEX

And this toast index can be dropped while it remains invalid:
=# drop index pg_toast.pg_toast_16385_index_ccold;
DROP INDEX

I recall testing that stuff for all the interrupts which could be
triggered, and in this case it waits at step 5 within
WaitForLockersMultiple(). Now, in your case you take an extra step
with a plain REINDEX, which forces a rebuild of the invalid toast
index, making it valid again and hence not droppable.

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

Julien sent a patch for that, but here are my ideas (which you are free to
reject):

Could you require an AEL (AccessExclusiveLock) for that case, or something which
would preclude "reindex table test_toast" from working?

Could you use atomic updates to ensure that exactly one index in an {old,new}
pair is invalid at any given time?

Could you make the new (invalid) toast index not visible to other transactions?

--
Justin Pryzby

#8Julien Rouhaud
rjuju123@gmail.com
In reply to: Julien Rouhaud (#6)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Sat, Feb 22, 2020 at 08:09:24AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 07:39:49AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 7:19 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Feb 18, 2020 at 07:06:25AM +0100, Julien Rouhaud wrote:

On Tue, Feb 18, 2020 at 6:30 AM Michael Paquier <michael@paquier.xyz> wrote:

Hmm. There could be an argument here for skipping invalid toast
indexes within reindex_index(), because we are sure to have at
least one valid toast index at any time, and these are not
affected by CIC.

PFA a patch to fix the problem using this approach.

I also added isolation tester regression tests. The failure is simulated using
a pg_cancel_backend() on top of pg_stat_activity, using filters on a
specifically set application name and the query text to avoid any unwanted
interaction. I also added a 1s locking delay, to ensure that even slow/CCA
machines can consistently reproduce the failure. Maybe that's not enough, or
maybe testing this scenario is not worth the extra time.

Sorry, I just realized that I forgot to commit the last changes before sending
the patch, so here's the correct v2.

Attachments:

0001-Don-t-reindex-invalid-indexes-on-TOAST-tables-v2.patch (text/plain; charset=us-ascii)
From 1b1bd50e4af4a2034638898129e6e49e3f4999da Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 21 Feb 2020 20:15:04 +0100
Subject: [PATCH] Don't reindex invalid indexes on TOAST tables.

Such indexes can only be duplicated leftovers of failed REINDEX CONCURRENTLY
commands.  As we only allow to drop invalid indexes on TOAST tables, reindexing
those would lead to useless duplicated indexes that can't be dropped anymore.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
---
 src/backend/catalog/index.c                   | 29 ++++++++++
 .../expected/reindex-concurrently.out         | 55 ++++++++++++++++++-
 .../isolation/specs/reindex-concurrently.spec | 27 +++++++++
 3 files changed, 110 insertions(+), 1 deletion(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8880586c37..201a3c39df 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_depend.h"
 #include "catalog/pg_description.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_tablespace.h"
@@ -3717,6 +3718,34 @@ reindex_relation(Oid relid, int flags, int options)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
 
+			/*
+			 * We skip any invalid index on a TOAST table.  Those can only be
+			 * a duplicate leftover of a failed REINDEX CONCURRENTLY, and if we
+			 * rebuild it it won't be possible to drop it anymore.
+			 */
+			if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
+			{
+				HeapTuple	tup;
+				bool		skipit;
+
+				tup = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
+				if (!HeapTupleIsValid(tup))
+					elog(ERROR, "cache lookup failed for index %u", indexOid);
+
+				skipit = ((Form_pg_index) GETSTRUCT(tup))->indisvalid == false;
+
+				ReleaseSysCache(tup);
+
+				if (skipit)
+				{
+					ereport(NOTICE,
+							 (errmsg("skipping invalid index \"%s.%s\"",
+									get_namespace_name(get_rel_namespace(indexOid)),
+									get_rel_name(indexOid))));
+					continue;
+				}
+			}
+
 			reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
 						  persistence, options);
 
diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
index 9e04169b2f..012b4874dd 100644
--- a/src/test/isolation/expected/reindex-concurrently.out
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -1,4 +1,4 @@
-Parsed test spec with 3 sessions
+Parsed test spec with 5 sessions
 
 starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
@@ -76,3 +76,56 @@ step end1: COMMIT;
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
 step end2: COMMIT;
 step reindex: <... completed>
+
+starting permutation: lock reindex sleep kill unlock check_invalid normal_reindex check_invalid nowarn reindex check_invalid
+step lock: BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE;
+data           
+
+aa             
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
+step sleep: SELECT pg_sleep(1);
+pg_sleep       
+
+               
+step kill: SELECT pg_cancel_backend(pid) FROM pg_stat_activity
+    WHERE application_name = 's3_reindex_concurrently'
+    AND query = 'REINDEX TABLE CONCURRENTLY reind_con_tab;'
+pg_cancel_backend
+
+t              
+step reindex: <... completed>
+error in steps kill reindex: ERROR:  canceling statement due to user request
+step unlock: COMMIT;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step normal_reindex: REINDEX TABLE reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step nowarn: SET client_min_messages = 'ERROR';
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
index eb59fe0cba..ecd784cc4a 100644
--- a/src/test/isolation/specs/reindex-concurrently.spec
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -30,7 +30,27 @@ step "del2" { DELETE FROM reind_con_tab WHERE data = 'cccc'; }
 step "end2" { COMMIT; }
 
 session "s3"
+setup { SET application_name TO "s3_reindex_concurrently"; }
 step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+step "nowarn" { SET client_min_messages = 'ERROR'; }
+
+session "s4"
+step "lock" { BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE; }
+step "sleep" { SELECT pg_sleep(1); }
+step "kill" { SELECT pg_cancel_backend(pid) FROM pg_stat_activity
+    WHERE application_name = 's3_reindex_concurrently'
+    AND query = 'REINDEX TABLE CONCURRENTLY reind_con_tab;' }
+step "unlock" { COMMIT; }
+
+session "s5"
+setup { SET client_min_messages = 'WARNING'; }
+step "normal_reindex" { REINDEX TABLE reind_con_tab; }
+step "check_invalid" {SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C"; }
 
 permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
 permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
@@ -38,3 +58,10 @@ permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
+# A failed REINDEX CONCURRENTLY will leave an invalid index on reind_con_tab
+# TOAST table.  Any  following successful REINDEX should leave this index as
+# invalid, otherwise we would end up with a useless and duplicated index that
+# can't be dropped.
+permutation "lock" "reindex" "sleep" "kill" "unlock" "check_invalid"
+    "normal_reindex" "check_invalid"
+    "nowarn" "reindex" "check_invalid"
-- 
2.20.1

#9Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#8)
Re: reindex concurrently and two toast indexes

On Sat, Feb 22, 2020 at 04:06:57PM +0100, Julien Rouhaud wrote:

Sorry, I just realized that I forgot to commit the last changes before sending
the patch, so here's the correct v2.

Thanks for the patch.

+	if (skipit)
+	{
+		ereport(NOTICE,
+			 (errmsg("skipping invalid index \"%s.%s\"",
+				 get_namespace_name(get_rel_namespace(indexOid)),
+				 get_rel_name(indexOid))));

ReindexRelationConcurrently() issues a WARNING when bumping on an
invalid index; shouldn't the same log level be used here?

Even with this patch, it is possible to reindex an invalid toast index
with REINDEX INDEX (with and without CONCURRENTLY), which is the
problem I mentioned upthread (er, actually only for the non-concurrent
case, as mentioned regarding reindex_index). Shouldn't both cases be
prevented as well with an ERROR?
--
Michael

#10Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#9)
Re: reindex concurrently and two toast indexes

On Thu, Feb 27, 2020 at 04:32:11PM +0900, Michael Paquier wrote:

On Sat, Feb 22, 2020 at 04:06:57PM +0100, Julien Rouhaud wrote:

Sorry, I just realized that I forgot to commit the last changes before sending
the patch, so here's the correct v2.

Thanks for the patch.

+	if (skipit)
+	{
+		ereport(NOTICE,
+			 (errmsg("skipping invalid index \"%s.%s\"",
+				 get_namespace_name(get_rel_namespace(indexOid)),
+				 get_rel_name(indexOid))));

ReindexRelationConcurrently() issues a WARNING when bumping on an
invalid index, shouldn't the same log level be used?

For ReindexRelationConcurrently, the index is skipped because the feature isn't
supported, hence a warning. In this case that would work; it's just that we
don't want to process such indexes, so I used a notice instead.

I'm not opposed to using a warning instead if you prefer. What errcode should be
used though, ERRCODE_WARNING? ERRCODE_FEATURE_NOT_SUPPORTED doesn't feel
right.

Even with this patch, it is possible to reindex an invalid toast index
with REINDEX INDEX (with and without CONCURRENTLY), which is the
problem I mentioned upthread (er, actually only for the non-concurrent
case, as mentioned regarding reindex_index). Shouldn't both cases be
prevented as well with an ERROR?

Ah indeed, sorry I missed that.

While looking at it, I see that invalid indexes seem to be leaked when the table
is dropped, with no way to get rid of them:

s1:
CREATE TABLE t1(val text);
CREATE INDEX ON t1 (val);
BEGIN;
SELECT * FROM t1 FOR UPDATE;

s2:
REINDEX TABLE CONCURRENTLY t1;
[stuck, then canceled]
SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid;
indexrelid | indrelid
-------------------------------------+-------------------------
t1_val_idx_ccold | t1
pg_toast.pg_toast_16385_index_ccold | pg_toast.pg_toast_16385
(2 rows)

s1:
ROLLBACK;
DROP TABLE t1;

SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid;
indexrelid | indrelid
-------------------------------------+----------
t1_val_idx_ccold | 16385
pg_toast.pg_toast_16385_index_ccold | 16388
(2 rows)

REINDEX INDEX t1_val_idx_ccold;
ERROR: XX000: could not open relation with OID 16385
LOCATION: relation_open, relation.c:62

DROP INDEX t1_val_idx_ccold;
ERROR: XX000: could not open relation with OID 16385
LOCATION: relation_open, relation.c:62

REINDEX INDEX pg_toast.pg_toast_16385_index_ccold;
ERROR: XX000: could not open relation with OID 16388
LOCATION: relation_open, relation.c:62

DROP INDEX pg_toast.pg_toast_16385_index_ccold;
ERROR: XX000: could not open relation with OID 16388
LOCATION: relation_open, relation.c:62

REINDEX DATABASE rjuju;
REINDEX

SELECT indexrelid::regclass, indrelid::regclass FROM pg_index WHERE NOT indisvalid;
indexrelid | indrelid
-------------------------------------+----------
t1_val_idx_ccold | 16385
pg_toast.pg_toast_16385_index_ccold | 16388
(2 rows)

Shouldn't DROP TABLE be fixed to also drop invalid indexes?
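A catalog query along these lines (an illustrative diagnostic, not from the original mail) can spot such orphaned entries, i.e. invalid indexes whose parent relation no longer exists:

```sql
-- Invalid indexes whose indrelid no longer resolves to a pg_class entry
SELECT i.indexrelid::regclass AS orphaned_index, i.indrelid
FROM pg_index i
LEFT JOIN pg_class c ON c.oid = i.indrelid
WHERE NOT i.indisvalid
  AND c.oid IS NULL;
```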

#11Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#10)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Thu, Feb 27, 2020 at 09:07:35AM +0100, Julien Rouhaud wrote:

While looking at it, I see that invalid indexes seem to be leaked when the table
is dropped, with no way to get rid of them:

Shouldn't DROP TABLE be fixed to also drop invalid indexes?

Hmm. The problem here is that I think we don't have the correct
interface to handle the dependency switching between the old
and new indexes from the start, and 68ac9cf made things better in some
aspects (like non-cancellation and old index drop) but not in others
(like yours, or even a column drop). changeDependenciesOf/On() have
been added especially for REINDEX CONCURRENTLY, but they are not
actually able to handle the case we want them to handle: doing a switch
for both relations within the same scan. It is possible to use the
existing routines three times with a couple of CCIs in-between and what
I would call a fake placeholder OID to switch all the records cleanly,
but it would be actually cleaner to do a single scan of pg_depend and
switch the dependencies of both objects at once.
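As a rough illustration of the single-scan idea, here is a simplified Python model of pg_depend records; the function name and row shape are invented for the sketch and are not the actual C interface:

```python
# Toy model: a pg_depend row is (classid, objid, refclassid, refobjid).
# switch_dependencies() retargets, in one pass, every record where the
# old or new index appears either as the dependent object ("of") or as
# the referenced object ("on"), swapping the two OIDs.

def switch_dependencies(depend_rows, class_id, old_oid, new_oid):
    def swap(oid):
        if oid == old_oid:
            return new_oid
        if oid == new_oid:
            return old_oid
        return oid

    result = []
    for classid, objid, refclassid, refobjid in depend_rows:
        if classid == class_id:
            objid = swap(objid)
        if refclassid == class_id:
            refobjid = swap(refobjid)
        result.append((classid, objid, refclassid, refobjid))
    return result
```

In this model, a record that made some object depend on the old index ends up pointing at the new one and vice versa, without the "delete the new index's records, then move the old ones" dance that the existing changeDependenciesOf/On() routines require.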

Attached is a draft patch to take care of that problem for HEAD. It
still needs a lot of polishing (variable names are not actually old
or new anymore, etc.) but that's enough to show the idea. If a version
reaches PG12, we would need to keep around the past routines to avoid
an ABI breakage, even if I doubt there are callers of it, but who
knows..
--
Michael

Attachments:

reindex-deps-v1.patch (text/x-diff; charset=us-ascii)
diff --git a/src/include/catalog/dependency.h b/src/include/catalog/dependency.h
index 0cd6fcf027..5811d082b0 100644
--- a/src/include/catalog/dependency.h
+++ b/src/include/catalog/dependency.h
@@ -200,10 +200,10 @@ extern long changeDependencyFor(Oid classId, Oid objectId,
 								Oid refClassId, Oid oldRefObjectId,
 								Oid newRefObjectId);
 
-extern long changeDependenciesOf(Oid classId, Oid oldObjectId,
+extern long switchDependenciesOf(Oid classId, Oid oldObjectId,
 								 Oid newObjectId);
 
-extern long changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+extern long switchDependenciesOn(Oid refClassId, Oid oldRefObjectId,
 								 Oid newRefObjectId);
 
 extern Oid	getExtensionOfObject(Oid classId, Oid objectId);
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1681f61727..fb9bb276bb 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1675,14 +1675,10 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies of and on the old index to the new one.  First
-	 * remove any dependencies that the new index may have to provide an
-	 * initial clean state for the dependency switch, and then move all the
-	 * dependencies from the old index to the new one.
+	 * Switch all dependencies of and on the old and new indexes.
 	 */
-	deleteDependencyRecordsFor(RelationRelationId, newIndexId, false);
-	changeDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
-	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+	switchDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
+	switchDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
 
 	/*
 	 * Copy over statistics from old to new index
diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index f9af245eec..11caa38083 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -396,7 +396,7 @@ changeDependencyFor(Oid classId, Oid objectId,
 }
 
 /*
- * Adjust all dependency records to come from a different object of the same type
+ * Switch dependency records between two objects of the same type
  *
  * classId/oldObjectId specify the old referencing object.
  * newObjectId is the new referencing object (must be of class classId).
@@ -404,38 +404,41 @@ changeDependencyFor(Oid classId, Oid objectId,
  * Returns the number of records updated.
  */
 long
-changeDependenciesOf(Oid classId, Oid oldObjectId,
+switchDependenciesOf(Oid classId, Oid oldObjectId,
 					 Oid newObjectId)
 {
 	long		count = 0;
 	Relation	depRel;
-	ScanKeyData key[2];
+	ScanKeyData key;
 	SysScanDesc scan;
 	HeapTuple	tup;
 
 	depRel = table_open(DependRelationId, RowExclusiveLock);
 
-	ScanKeyInit(&key[0],
+	ScanKeyInit(&key,
 				Anum_pg_depend_classid,
 				BTEqualStrategyNumber, F_OIDEQ,
 				ObjectIdGetDatum(classId));
-	ScanKeyInit(&key[1],
-				Anum_pg_depend_objid,
-				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(oldObjectId));
 
 	scan = systable_beginscan(depRel, DependDependerIndexId, true,
-							  NULL, 2, key);
+							  NULL, 1, &key);
 
 	while (HeapTupleIsValid((tup = systable_getnext(scan))))
 	{
 		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
 
+		if (depform->objid != oldObjectId &&
+			depform->objid != newObjectId)
+			continue;
+
 		/* make a modifiable copy */
 		tup = heap_copytuple(tup);
 		depform = (Form_pg_depend) GETSTRUCT(tup);
 
-		depform->objid = newObjectId;
+		if (depform->objid == oldObjectId)
+			depform->objid = newObjectId;
+		else
+			depform->objid = oldObjectId;
 
 		CatalogTupleUpdate(depRel, &tup->t_self, tup);
 
@@ -452,7 +455,7 @@ changeDependenciesOf(Oid classId, Oid oldObjectId,
 }
 
 /*
- * Adjust all dependency records to point to a different object of the same type
+ * Switch all dependency records between two objects of the same type.
  *
  * refClassId/oldRefObjectId specify the old referenced object.
  * newRefObjectId is the new referenced object (must be of class refClassId).
@@ -460,74 +463,75 @@ changeDependenciesOf(Oid classId, Oid oldObjectId,
  * Returns the number of records updated.
  */
 long
-changeDependenciesOn(Oid refClassId, Oid oldRefObjectId,
+switchDependenciesOn(Oid refClassId, Oid oldRefObjectId,
 					 Oid newRefObjectId)
 {
 	long		count = 0;
 	Relation	depRel;
-	ScanKeyData key[2];
+	ScanKeyData key;
 	SysScanDesc scan;
 	HeapTuple	tup;
-	ObjectAddress objAddr;
-	bool		newIsPinned;
+	ObjectAddress newObjAddr;
+	ObjectAddress oldObjAddr;
 
 	depRel = table_open(DependRelationId, RowExclusiveLock);
 
 	/*
-	 * If oldRefObjectId is pinned, there won't be any dependency entries on
-	 * it --- we can't cope in that case.  (This isn't really worth expending
-	 * code to fix, in current usage; it just means you can't rename stuff out
-	 * of pg_catalog, which would likely be a bad move anyway.)
+	 * If oldRefObjectId or newRefObjectId are pinned, there won't be any
+	 * dependency entries on it --- we can't cope in that case.  (This
+	 * isn't really worth expending code to fix, in current usage; it
+	 * just means you can't rename stuff out of pg_catalog, which would
+	 * likely be a bad move anyway.)
 	 */
-	objAddr.classId = refClassId;
-	objAddr.objectId = oldRefObjectId;
-	objAddr.objectSubId = 0;
+	oldObjAddr.classId = refClassId;
+	oldObjAddr.objectId = oldRefObjectId;
+	oldObjAddr.objectSubId = 0;
 
-	if (isObjectPinned(&objAddr, depRel))
+	if (isObjectPinned(&oldObjAddr, depRel))
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot remove dependency on %s because it is a system object",
-						getObjectDescription(&objAddr))));
+						getObjectDescription(&oldObjAddr))));
 
-	/*
-	 * We can handle adding a dependency on something pinned, though, since
-	 * that just means deleting the dependency entry.
-	 */
-	objAddr.objectId = newRefObjectId;
+	newObjAddr.classId = refClassId;
+	newObjAddr.objectId = newRefObjectId;
+	newObjAddr.objectSubId = 0;
 
-	newIsPinned = isObjectPinned(&objAddr, depRel);
+	if (isObjectPinned(&newObjAddr, depRel))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot remove dependency on %s because it is a system object",
+						getObjectDescription(&newObjAddr))));
 
 	/* Now search for dependency records */
-	ScanKeyInit(&key[0],
+	ScanKeyInit(&key,
 				Anum_pg_depend_refclassid,
 				BTEqualStrategyNumber, F_OIDEQ,
 				ObjectIdGetDatum(refClassId));
-	ScanKeyInit(&key[1],
-				Anum_pg_depend_refobjid,
-				BTEqualStrategyNumber, F_OIDEQ,
-				ObjectIdGetDatum(oldRefObjectId));
 
 	scan = systable_beginscan(depRel, DependReferenceIndexId, true,
-							  NULL, 2, key);
+							  NULL, 1, &key);
 
 	while (HeapTupleIsValid((tup = systable_getnext(scan))))
 	{
 		Form_pg_depend depform = (Form_pg_depend) GETSTRUCT(tup);
 
-		if (newIsPinned)
-			CatalogTupleDelete(depRel, &tup->t_self);
-		else
-		{
-			/* make a modifiable copy */
-			tup = heap_copytuple(tup);
-			depform = (Form_pg_depend) GETSTRUCT(tup);
+		if (depform->refobjid != oldRefObjectId &&
+			depform->refobjid != newRefObjectId)
+			continue;
 
+		/* make a modifiable copy */
+		tup = heap_copytuple(tup);
+		depform = (Form_pg_depend) GETSTRUCT(tup);
+
+		if (depform->refobjid == oldRefObjectId)
 			depform->refobjid = newRefObjectId;
+		else
+			depform->refobjid = oldRefObjectId;
 
-			CatalogTupleUpdate(depRel, &tup->t_self, tup);
+		CatalogTupleUpdate(depRel, &tup->t_self, tup);
 
-			heap_freetuple(tup);
-		}
+		heap_freetuple(tup);
 
 		count++;
 	}
#12Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#11)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Tue, Mar 03, 2020 at 05:06:42PM +0900, Michael Paquier wrote:

Attached is a draft patch to take care of that problem for HEAD. It
still needs a lot of polishing (variable names are not actually old
or new anymore, etc.) but that's enough to show the idea. If a version
reaches PG12, we would need to keep around the past routines to avoid
an ABI breakage, even if I doubt there are callers of it, but who
knows..

Or actually, a simpler solution is to abuse the two existing
routines so that the dependency switch is done the other way around,
from the new index to the old one. That would visibly work because
there is no CCI between each scan, and it is faster because the scan
of pg_depend only touches the entries in need of an update. I'll
look at that again tomorrow; it is late here and I may be missing
something obvious.
--
Michael

Attachments:

reindex-deps-v2.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1681f61727..736bc9f66c 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1675,14 +1675,13 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies of and on the old index to the new one.  First
-	 * remove any dependencies that the new index may have to provide an
-	 * initial clean state for the dependency switch, and then move all the
-	 * dependencies from the old index to the new one.
+	 * Swap all dependencies of and on the old index to the new one, and
+	 * vice-versa.
 	 */
-	deleteDependencyRecordsFor(RelationRelationId, newIndexId, false);
 	changeDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
 	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
+	changeDependenciesOf(RelationRelationId, newIndexId, oldIndexId);
+	changeDependenciesOn(RelationRelationId, newIndexId, oldIndexId);
 
 	/*
 	 * Copy over statistics from old to new index
#13Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#12)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Tue, Mar 03, 2020 at 06:25:51PM +0900, Michael Paquier wrote:

Or actually, a simpler solution is to abuse the two existing
routines so that the dependency switch is done the other way around,
from the new index to the old one. That would visibly work because
there is no CCI between each scan, and it is faster because the scan
of pg_depend only touches the entries in need of an update. I'll
look at that again tomorrow; it is late here and I may be missing
something obvious.

It was a good inspiration. I have been torturing this patch today and
played with it by injecting elog(ERROR) calls in the middle of reindex
concurrently for all the phases, and checked manually that the
pg_depend entries for the new and old indexes map correctly. So this
takes care of your problem. Attached is an updated patch with a
refreshed comment about this code's dependency on CCIs. I'd like to
go fix this issue first.
--
Michael

Attachments:

reindex-deps-v3.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 1681f61727..7223679033 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1675,12 +1675,13 @@ index_concurrently_swap(Oid newIndexId, Oid oldIndexId, const char *oldName)
 	}
 
 	/*
-	 * Move all dependencies of and on the old index to the new one.  First
-	 * remove any dependencies that the new index may have to provide an
-	 * initial clean state for the dependency switch, and then move all the
-	 * dependencies from the old index to the new one.
+	 * Swap all dependencies of and on the old index to the new one, and
+	 * vice-versa.  Note that a call to CommandCounterIncrement() would cause
+	 * duplicate entries in pg_depend, so this should not be done.
 	 */
-	deleteDependencyRecordsFor(RelationRelationId, newIndexId, false);
+	changeDependenciesOf(RelationRelationId, newIndexId, oldIndexId);
+	changeDependenciesOn(RelationRelationId, newIndexId, oldIndexId);
+
 	changeDependenciesOf(RelationRelationId, oldIndexId, newIndexId);
 	changeDependenciesOn(RelationRelationId, oldIndexId, newIndexId);
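The CCI note in the comment above can be illustrated with a toy snapshot model (plain Python; a deliberate simplification in which each scan reads the snapshot taken at the last CommandCounterIncrement while updates land in the live catalog):

```python
# With no CCI between the two passes, the second pass cannot see (and
# re-flip) the rows the first pass just rewrote, so the net effect is a
# clean swap instead of duplicate entries.
def change_dependencies_of(snapshot, live, old_oid, new_oid):
    for i, objid in enumerate(snapshot):   # scan sees the stale snapshot
        if objid == old_oid:
            live[i] = new_oid              # update hits the live data

live = [16401, 16402]       # one dependency row per index (old, new)
snapshot = list(live)       # snapshot as of the last CCI

change_dependencies_of(snapshot, live, 16402, 16401)  # new -> old first
change_dependencies_of(snapshot, live, 16401, 16402)  # then old -> new
# live is now [16402, 16401]: a clean swap.  If a CCI refreshed the
# snapshot between the calls, the second pass would also redo the
# first pass's rows.
```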
 
#14Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#13)
Re: reindex concurrently and two toast indexes

On Wed, Mar 4, 2020 at 6:15 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Mar 03, 2020 at 06:25:51PM +0900, Michael Paquier wrote:

Or actually, a more simple solution is to abuse of the two existing
routines so as the dependency switch is done the other way around,
from the new index to the old one. That would visibly work because
there is no CCI between each scan, and that's faster because the scan
of pg_depend is done only on the entries in need of an update. I'll
look at that again tomorrow, it is late here and I may be missing
something obvious.

It was a good inspiration. I have been torturing this patch today and
played with it by injecting elog(ERROR) calls in the middle of reindex
concurrently for all the phases, and checked manually that the
pg_depend entries for the new and old indexes map correctly. So this
takes care of your problem. Attached is an updated patch with a
refreshed comment about this code's dependency on CCIs. I'd like to
go fix this issue first.

Thanks for the patch! I started to look at it during the weekend, but
I got interrupted and unfortunately didn't have time to look at it
since.

The fix looks good to me. I also tried multiple failure scenarios and
it's unsurprisingly working just fine. Should we add some regression
tests for that? I guess most of it could be borrowed from the patch
to fix the toast index issue I sent last week.

#15Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#14)
Re: reindex concurrently and two toast indexes

On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote:

Thanks for the patch! I started to look at it during the weekend, but
I got interrupted and unfortunately didn't have time to look at it
since.

No problem, thanks for looking at it. I have looked at it again this
morning, and applied it.

The fix looks good to me. I also tried multiple failure scenarios and
it's unsurprisingly working just fine. Should we add some regression
tests for that? I guess most of it could be borrowed from the patch
to fix the toast index issue I sent last week.

I have doubts when it comes to using a strategy based on
pg_cancel_backend() and a match on application_name (see for example
5ad72ce, but I cannot find the associated thread). I think that we
could design something more robust here and usable by all tests, with
two things coming to mind:
- A new meta-command for isolation tests to be able to cancel a
session with PQcancel().
- Fault injection in the backend.
For the case of this thread, the cancellation command would be a better
match.
--
Michael

#16Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#15)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Thu, Mar 05, 2020 at 12:53:54PM +0900, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote:

Thanks for the patch! I started to look at it during the weekend, but
I got interrupted and unfortunately didn't have time to look at it
since.

No problem, thanks for looking at it. I have looked at it again this
morning, and applied it.

The fix looks good to me. I also tried multiple failure scenarios and
it's unsurprisingly working just fine. Should we add some regression
tests for that? I guess most of it could be borrowed from the patch
to fix the toast index issue I sent last week.

I have doubts when it comes to using a strategy based on
pg_cancel_backend() and a match on application_name (see for example
5ad72ce, but I cannot find the associated thread). I think that we
could design something more robust here and usable by all tests, with
two things coming to mind:
- A new meta-command for isolation tests to be able to cancel a
session with PQcancel().
- Fault injection in the backend.
For the case of this thread, the cancellation command would be a better
match.

I agree that the approach wasn't quite robust. I'll try to look at adding a
new command for isolationtester, but that's probably not something we want to
put in pg13?

Here's a v3 that addresses the various comments you previously noted, and
from which I also removed the regression tests.

Note that while looking at it, I noticed another bug in RIC:

# create table t1(id integer, val text); create index on t1(val);
CREATE TABLE

CREATE INDEX

# reindex table concurrently t1;
^CCancel request sent
ERROR: 57014: canceling statement due to user request
LOCATION: ProcessInterrupts, postgres.c:3171

# select indexrelid::regclass, indrelid::regclass, indexrelid, indrelid from pg_index where not indisvalid;
indexrelid | indrelid | indexrelid | indrelid
-------------------------------------+-------------------------+------------+----------
t1_val_idx_ccold | t1 | 16401 | 16395
pg_toast.pg_toast_16395_index_ccold | pg_toast.pg_toast_16395 | 16400 | 16398
(2 rows)

# reindex table concurrently t1;
WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2821
WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2867
REINDEX

# reindex index concurrently t1_val_idx_ccold;
REINDEX

That case is also fixed in this patch.

Attachments:

0001-Don-t-reindex-invalid-indexes-on-TOAST-tables-v3.patch (text/plain; charset=us-ascii)
From 77ec865a9a655b2b973846f9a8fa93c966ca55f5 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 21 Feb 2020 20:15:04 +0100
Subject: [PATCH] Don't reindex invalid indexes on TOAST tables.

Such indexes can only be duplicated leftovers of failed REINDEX CONCURRENTLY
commands.  As we only allow to drop invalid indexes on TOAST tables, reindexing
those would lead to useless duplicated indexes that can't be dropped anymore.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
---
 src/backend/catalog/index.c      | 30 ++++++++++++++++++++
 src/backend/commands/indexcmds.c | 47 +++++++++++++++++++++++++-------
 2 files changed, 67 insertions(+), 10 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7223679033..d3d28df97a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_depend.h"
 #include "catalog/pg_description.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_tablespace.h"
@@ -3724,6 +3725,35 @@ reindex_relation(Oid relid, int flags, int options)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
 
+			/*
+			 * We skip any invalid index on a TOAST table.  Those can only be
+			 * a duplicate leftover of a failed REINDEX CONCURRENTLY, and if we
+			 * rebuild it it won't be possible to drop it anymore.
+			 */
+			if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
+			{
+				HeapTuple	tup;
+				bool		skipit;
+
+				tup = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
+				if (!HeapTupleIsValid(tup))
+					elog(ERROR, "cache lookup failed for index %u", indexOid);
+
+				skipit = ((Form_pg_index) GETSTRUCT(tup))->indisvalid == false;
+
+				ReleaseSysCache(tup);
+
+				if (skipit)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("cannot reindex invalid TOAST index \"%s.%s\", skipping",
+									get_namespace_name(get_rel_namespace(indexOid)),
+									get_rel_name(indexOid))));
+					continue;
+				}
+			}
+
 			reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
 						  persistence, options);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index ec20ba38d1..7985d98aa8 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -28,6 +28,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_tablespace.h"
@@ -2337,6 +2338,7 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	Oid			indOid;
 	Relation	irel;
 	char		persistence;
+	bool		invalidtoastindex;
 
 	/*
 	 * Find and lock index, and check permissions on table; use callback to
@@ -2362,6 +2364,9 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	 */
 	irel = index_open(indOid, NoLock);
 
+	invalidtoastindex = (irel->rd_rel->relkind == RELKIND_INDEX &&
+						 !irel->rd_index->indisvalid);
+
 	if (irel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
 	{
 		ReindexPartitionedIndex(irel);
@@ -2371,6 +2376,16 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
+	if (invalidtoastindex)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex invalid TOAST index \"%s.%s\", skipping",
+						get_namespace_name(get_rel_namespace(indOid)),
+						get_rel_name(indOid))));
+		return;
+	}
+
 	if (concurrent && persistence != RELPERSISTENCE_TEMP)
 		ReindexRelationConcurrently(indOid, options);
 	else
@@ -2890,25 +2905,37 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 		case RELKIND_INDEX:
 			{
 				Oid			heapId = IndexGetRelation(relationOid, false);
+				Relation	indexRelation = index_open(relationOid,
+													   ShareUpdateExclusiveLock);
 
 				if (IsCatalogRelationOid(heapId))
 					ereport(ERROR,
 							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 							 errmsg("cannot reindex system catalogs concurrently")));
+				else if (!indexRelation->rd_index->indisvalid)
+					ereport(WARNING,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("cannot reindex invalid index \"%s.%s\" concurrently, skipping",
+									get_namespace_name(get_rel_namespace(relationOid)),
+									get_rel_name(relationOid))));
+				else
+				{
+					/* Save the list of relation OIDs in private context */
+					oldcontext = MemoryContextSwitchTo(private_context);
 
-				/* Save the list of relation OIDs in private context */
-				oldcontext = MemoryContextSwitchTo(private_context);
+					/* Track the heap relation of this index for session locks */
+					heapRelationIds = list_make1_oid(heapId);
 
-				/* Track the heap relation of this index for session locks */
-				heapRelationIds = list_make1_oid(heapId);
+					/*
+					 * Save the list of relation OIDs in private context.  Note
+					 * that invalid indexes are allowed here.
+					 */
+					indexIds = lappend_oid(indexIds, relationOid);
 
-				/*
-				 * Save the list of relation OIDs in private context.  Note
-				 * that invalid indexes are allowed here.
-				 */
-				indexIds = lappend_oid(indexIds, relationOid);
+					MemoryContextSwitchTo(oldcontext);
+				}
 
-				MemoryContextSwitchTo(oldcontext);
+				index_close(indexRelation, NoLock);
 				break;
 			}
 		case RELKIND_PARTITIONED_TABLE:
-- 
2.20.1

#17Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#16)
Re: reindex concurrently and two toast indexes

On Thu, Mar 05, 2020 at 05:57:07PM +0100, Julien Rouhaud wrote:

I agree that the approach wasn't quite robust. I'll try to look at adding a
new command for isolationtester, but that's probably not something we want to
put in pg13?

Yes, that's too late.

Note that while looking at it, I noticed another bug in RIC:

[...]

# reindex table concurrently t1;
WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2821
WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2867
REINDEX
# reindex index concurrently t1_val_idx_ccold;
REINDEX

That case is also fixed in this patch.

This choice is intentional. The idea behind bypassing invalid indexes
for table-level REINDEX is that reindexing them would lead to a bloat
in the number of relations to handle if multiple runs fail, leaving
more and more invalid indexes to deal with each time. Allowing a
single invalid non-toast index to be reindexed with CONCURRENTLY can
be helpful in some cases, like a CIC for a unique index that failed
and was left invalid, where the relation already defined can be
reused.
--
Michael

#18Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#17)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Fri, Mar 06, 2020 at 10:38:44AM +0900, Michael Paquier wrote:

On Thu, Mar 05, 2020 at 05:57:07PM +0100, Julien Rouhaud wrote:

I agree that the approach wasn't quite robust. I'll try to look at adding a
new command for isolationtester, but that's probably not something we want to
put in pg13?

Yes, that's too late.

Note that while looking at it, I noticed another bug in RIC:

[...]

# reindex table concurrently t1;
WARNING: 0A000: cannot reindex invalid index "public.t1_val_idx_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2821
WARNING: XX002: cannot reindex invalid index "pg_toast.pg_toast_16395_index_ccold" concurrently, skipping
LOCATION: ReindexRelationConcurrently, indexcmds.c:2867
REINDEX
# reindex index concurrently t1_val_idx_ccold;
REINDEX

That case is also fixed in this patch.

This choice is intentional. The idea behind bypassing invalid indexes
for table-level REINDEX is that reindexing them would lead to a bloat
in the number of relations to handle if multiple runs fail, leaving
more and more invalid indexes to deal with each time. Allowing a
single invalid non-toast index to be reindexed with CONCURRENTLY can
be helpful in some cases, like a CIC for a unique index that failed
and was left invalid, where the relation already defined can be
reused.

Ah I see, thanks for the clarification. I guess there's room for improvement
in the comments about that, since the ERRCODE_FEATURE_NOT_SUPPORTED usage is
quite misleading there.

v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid
non-TOAST index anymore.

Attachments:

0001-Don-t-reindex-invalid-indexes-on-TOAST-tables-v4.patch (text/plain; charset=us-ascii)
From 7bf57256192806e1caafc3dec68061e473d3978a Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 21 Feb 2020 20:15:04 +0100
Subject: [PATCH] Don't reindex invalid indexes on TOAST tables.

Such indexes can only be duplicated leftovers of failed REINDEX CONCURRENTLY
commands.  As we only allow to drop invalid indexes on TOAST tables, reindexing
those would lead to useless duplicated indexes that can't be dropped anymore.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
---
 src/backend/catalog/index.c      | 30 ++++++++++++++++++++++++++++++
 src/backend/commands/indexcmds.c | 15 +++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7223679033..d3d28df97a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_depend.h"
 #include "catalog/pg_description.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_tablespace.h"
@@ -3724,6 +3725,35 @@ reindex_relation(Oid relid, int flags, int options)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
 
+			/*
+			 * We skip any invalid index on a TOAST table.  Those can only be
+			 * a duplicate leftover of a failed REINDEX CONCURRENTLY, and if we
+			 * rebuild it it won't be possible to drop it anymore.
+			 */
+			if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
+			{
+				HeapTuple	tup;
+				bool		skipit;
+
+				tup = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(indexOid));
+				if (!HeapTupleIsValid(tup))
+					elog(ERROR, "cache lookup failed for index %u", indexOid);
+
+				skipit = ((Form_pg_index) GETSTRUCT(tup))->indisvalid == false;
+
+				ReleaseSysCache(tup);
+
+				if (skipit)
+				{
+					ereport(WARNING,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("cannot reindex invalid TOAST index \"%s.%s\", skipping",
+									get_namespace_name(get_rel_namespace(indexOid)),
+									get_rel_name(indexOid))));
+					continue;
+				}
+			}
+
 			reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
 						  persistence, options);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3f3a89fe92..08d74fecd1 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -28,6 +28,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_tablespace.h"
@@ -2309,6 +2310,7 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	Oid			indOid;
 	Relation	irel;
 	char		persistence;
+	bool		invalidtoastindex;
 
 	/*
 	 * Find and lock index, and check permissions on table; use callback to
@@ -2334,6 +2336,9 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	 */
 	irel = index_open(indOid, NoLock);
 
+	invalidtoastindex = (irel->rd_rel->relnamespace == PG_TOAST_NAMESPACE &&
+						 !irel->rd_index->indisvalid);
+
 	if (irel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX)
 	{
 		ReindexPartitionedIndex(irel);
@@ -2343,6 +2348,16 @@ ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
 	persistence = irel->rd_rel->relpersistence;
 	index_close(irel, NoLock);
 
+	if (invalidtoastindex)
+	{
+		ereport(WARNING,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex invalid TOAST index \"%s.%s\", skipping",
+						get_namespace_name(get_rel_namespace(indOid)),
+						get_rel_name(indOid))));
+		return;
+	}
+
 	if (concurrent && persistence != RELPERSISTENCE_TEMP)
 		ReindexRelationConcurrently(indOid, options);
 	else
-- 
2.20.1

#19Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#15)
2 attachment(s)
Add an optional timeout clause to isolationtester step.

On Thu, Mar 05, 2020 at 12:53:54PM +0900, Michael Paquier wrote:

On Wed, Mar 04, 2020 at 09:21:45AM +0100, Julien Rouhaud wrote:

Should we add some regression
tests for that? I guess most of it could be borrowed from the patch
to fix the toast index issue I sent last week.

I have doubts when it comes to using a strategy based on
pg_cancel_backend() and a match on application_name (see for example
5ad72ce, but I cannot find the associated thread). I think that we
could design something more robust here and usable by all tests, with
two things coming to mind:
- A new meta-command for isolation tests to be able to cancel a
session with PQcancel().
- Fault injection in the backend.
For the case of this thread, the cancellation command would be a better
match.

Here's a patch to add an optional "timeout val" clause to isolationtester's
step definition. When used, isolationtester will actively wait on the query
rather than continuing with the permutation's next step, and will issue a
cancel once the defined timeout is reached. I also added as a POC the
previous regression tests for invalid TOAST indexes, updated to use this new
infrastructure (which won't pass as long as the original bug with invalid
TOAST indexes isn't fixed).

I'll park that in the next commitfest, with a v14 target version.

Attachments:

0001-Add-an-optional-timeout-value-to-isolationtester-ste-v1.patchtext/x-diff; charset=us-asciiDownload
From 8480fa105032e15ef76926f9247b2f5af97223d0 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 6 Mar 2020 13:40:56 +0100
Subject: [PATCH 1/2] Add an optional timeout value to isolationtester steps.

Some sanity checks can require a command to wait on a lock and eventually be
cancelled.  The only way to do that was to rely on calls to pg_cancel_backend()
filtering the pg_stat_activity view, but this isn't a satisfactory solution as
there's no way to guarantee that only the intended backend will be canceled.

Instead, add a new optional "timeout val" clause to the step definition.  When
this clause is specified, isolationtester will actively wait on that command,
and issue a cancel when the given timeout is reached.

Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/20200305035354.GQ2593%40paquier.xyz
---
 src/test/isolation/README            | 11 ++++++--
 src/test/isolation/isolationtester.c | 42 ++++++++++++++++++++++++----
 src/test/isolation/isolationtester.h |  2 ++
 src/test/isolation/specparse.y       | 18 ++++++++++--
 src/test/isolation/specscanner.l     |  8 ++++++
 5 files changed, 70 insertions(+), 11 deletions(-)

diff --git a/src/test/isolation/README b/src/test/isolation/README
index 217953d183..827bea0c42 100644
--- a/src/test/isolation/README
+++ b/src/test/isolation/README
@@ -86,10 +86,12 @@ session "<name>"
 
   Each step has the syntax
 
-  step "<name>" { <SQL> }
+  step "<name>" { <SQL> } [ TIMEOUT seconds ]
 
-  where <name> is a name identifying this step, and SQL is a SQL statement
-  (or statements, separated by semicolons) that is executed in the step.
+  where <name> is a name identifying this step, SQL is a SQL statement
+  (or statements, separated by semicolons) executed in the step, and
+  seconds is an optional timeout after which the given SQL statement(s)
+  will be canceled.
   Step names must be unique across the whole spec file.
 
 permutation "<step name>" ...
@@ -125,6 +127,9 @@ after PGISOLATIONTIMEOUT seconds.  If the cancel doesn't work, isolationtester
 will exit uncleanly after a total of twice PGISOLATIONTIMEOUT.  Testing
 invalid permutations should be avoided because they can make the isolation
 tests take a very long time to run, and they serve no useful testing purpose.
+If a step specifies the optional timeout clause, then isolationtester will
+actively wait for that step's command to complete rather than continuing with
+the permutation's next step, and will send a cancel once the timeout is reached.
 
 Note that isolationtester recognizes that a command has blocked by looking
 to see if it is shown as waiting in the pg_locks view; therefore, only
diff --git a/src/test/isolation/isolationtester.c b/src/test/isolation/isolationtester.c
index f80261c022..dd5d335027 100644
--- a/src/test/isolation/isolationtester.c
+++ b/src/test/isolation/isolationtester.c
@@ -120,10 +120,28 @@ main(int argc, char **argv)
 	spec_yyparse();
 	testspec = &parseresult;
 
-	/* Create a lookup table of all steps. */
+	/*
+	 * Create a lookup table of all steps and validate any timeout
+	 * specification.
+	 */
 	nallsteps = 0;
 	for (i = 0; i < testspec->nsessions; i++)
+	{
 		nallsteps += testspec->sessions[i]->nsteps;
+		for (j = 0; j < testspec->sessions[i]->nsteps; j++)
+		{
+			if ((testspec->sessions[i]->steps[j]->timeout * USECS_PER_SEC) >=
+					(max_step_wait / 2))
+			{
+				fprintf(stderr, "step %s: step timeout (%d s) must be less"
+						" than half the global timeout (%ld s)\n",
+						testspec->sessions[i]->steps[j]->name,
+						testspec->sessions[i]->steps[j]->timeout,
+						(max_step_wait / USECS_PER_SEC));
+				exit(1);
+			}
+		}
+	}
 
 	allsteps = pg_malloc(nallsteps * sizeof(Step *));
 
@@ -587,8 +605,14 @@ run_permutation(TestSpec *testspec, int nsteps, Step **steps)
 			exit(1);
 		}
 
-		/* Try to complete this step without blocking.  */
-		mustwait = try_complete_step(testspec, step, STEP_NONBLOCK);
+		/*
+		 * Try to complete this step without blocking, unless the step has a
+		 * timeout.
+		 */
+		mustwait = try_complete_step(testspec, step,
+				(step->timeout == 0 ? STEP_NONBLOCK : 0));
+		if (step->timeout != 0)
+			report_error_message(step);
 
 		/* Check for completion of any steps that were previously waiting. */
 		w = 0;
@@ -721,6 +745,7 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 		{
 			struct timeval current_time;
 			int64		td;
+			int64		step_timeout;
 
 			/* If it's OK for the step to block, check whether it has. */
 			if (flags & STEP_NONBLOCK)
@@ -778,6 +803,11 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 			td *= USECS_PER_SEC;
 			td += (int64) current_time.tv_usec - (int64) start_time.tv_usec;
 
+			if (step->timeout)
+				step_timeout = step->timeout * USECS_PER_SEC;
+			else
+				step_timeout = max_step_wait;
+
 			/*
 			 * After max_step_wait microseconds, try to cancel the query.
 			 *
@@ -787,7 +817,7 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 			 * failing, but remaining permutations and tests should still be
 			 * OK.
 			 */
-			if (td > max_step_wait && !canceled)
+			if (td > step_timeout && !canceled)
 			{
 				PGcancel   *cancel = PQgetCancel(conn);
 
@@ -812,13 +842,13 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 			}
 
 			/*
-			 * After twice max_step_wait, just give up and die.
+			 * After twice the step timeout, just give up and die.
 			 *
 			 * Since cleanup steps won't be run in this case, this may cause
 			 * later tests to fail.  That stinks, but it's better than waiting
 			 * forever for the server to respond to the cancel.
 			 */
-			if (td > 2 * max_step_wait)
+			if (td > 2 * step_timeout)
 			{
 				fprintf(stderr, "step %s timed out after %d seconds\n",
 						step->name, (int) (td / USECS_PER_SEC));
diff --git a/src/test/isolation/isolationtester.h b/src/test/isolation/isolationtester.h
index 9cf5012416..0a0dd54ff3 100644
--- a/src/test/isolation/isolationtester.h
+++ b/src/test/isolation/isolationtester.h
@@ -33,6 +33,8 @@ struct Step
 	char	   *name;
 	char	   *sql;
 	char	   *errormsg;
+	int			timeout;
+	struct timeval start_time;
 };
 
 typedef struct
diff --git a/src/test/isolation/specparse.y b/src/test/isolation/specparse.y
index 5e007e1bf0..76842c42ab 100644
--- a/src/test/isolation/specparse.y
+++ b/src/test/isolation/specparse.y
@@ -25,6 +25,7 @@ TestSpec		parseresult;			/* result of parsing is left here */
 %union
 {
 	char	   *str;
+	int			ival;
 	Session	   *session;
 	Step	   *step;
 	Permutation *permutation;
@@ -43,9 +44,11 @@ TestSpec		parseresult;			/* result of parsing is left here */
 %type <session> session
 %type <step> step
 %type <permutation> permutation
+%type <ival> opt_timeout
 
 %token <str> sqlblock string_literal
-%token PERMUTATION SESSION SETUP STEP TEARDOWN TEST
+%token <ival> INTEGER
+%token PERMUTATION SESSION SETUP STEP TEARDOWN TEST TIMEOUT
 
 %%
 
@@ -140,16 +143,27 @@ step_list:
 
 
 step:
-			STEP string_literal sqlblock
+			STEP string_literal sqlblock opt_timeout
 			{
 				$$ = pg_malloc(sizeof(Step));
 				$$->name = $2;
 				$$->sql = $3;
 				$$->used = false;
 				$$->errormsg = NULL;
+				$$->timeout = $4;
 			}
 		;
 
+opt_timeout:
+			TIMEOUT INTEGER
+			{
+				$$ = $2;
+			}
+			| /* EMPTY */
+			{
+				$$ = 0;
+			}
+		;
 
 opt_permutation_list:
 			permutation_list
diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
index 410f17727e..1ec1812569 100644
--- a/src/test/isolation/specscanner.l
+++ b/src/test/isolation/specscanner.l
@@ -53,6 +53,7 @@ session			{ return SESSION; }
 setup			{ return SETUP; }
 step			{ return STEP; }
 teardown		{ return TEARDOWN; }
+timeout			{ return TIMEOUT; }
 
 [\n]			{ yyline++; }
 {comment}		{ /* ignore */ }
@@ -96,6 +97,13 @@ teardown		{ return TEARDOWN; }
 					yyerror("unterminated sql block");
 				}
 
+ /* integer */
+[0-9]+			{
+					yylval.ival = atoi(yytext);
+					return INTEGER;
+
+				}
+
 .				{
 					fprintf(stderr, "syntax error at line %d: unexpected character \"%s\"\n", yyline, yytext);
 					exit(1);
-- 
2.25.1

0002-Add-regression-tests-for-failed-REINDEX-TABLE-CONCUR-v2.patchtext/x-diff; charset=us-asciiDownload
From 1d112c1bbb563a76198435d60047e8a2b96bcd4a Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 6 Mar 2020 13:51:35 +0100
Subject: [PATCH 2/2] Add regression tests for failed REINDEX TABLE
 CONCURRENTLY.

If a REINDEX TABLE CONCURRENTLY fails on a table having a TOAST table, an
invalid index will be left behind on the TOAST table.  As we only allow
dropping invalid indexes on TOAST tables, reindexing those would lead to
useless duplicate indexes that can't be dropped anymore.

Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
---
 .../expected/reindex-concurrently.out         | 54 ++++++++++++++++++-
 .../isolation/specs/reindex-concurrently.spec | 19 +++++++
 2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
index 9e04169b2f..69a55d3788 100644
--- a/src/test/isolation/expected/reindex-concurrently.out
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -1,4 +1,4 @@
-Parsed test spec with 3 sessions
+Parsed test spec with 5 sessions
 
 starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
@@ -76,3 +76,55 @@ step end1: COMMIT;
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
 step end2: COMMIT;
 step reindex: <... completed>
+
+starting permutation: check_invalid lock reindex_timeout unlock check_invalid nowarn normal_reindex check_invalid reindex check_invalid
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+t              
+step lock: BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE;
+data           
+
+aa             
+isolationtester: canceling step reindex_timeout after 1 seconds
+step reindex_timeout: REINDEX TABLE CONCURRENTLY reind_con_tab;
+ERROR:  canceling statement due to user request
+step unlock: COMMIT;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step nowarn: SET client_min_messages = 'ERROR';
+step normal_reindex: REINDEX TABLE reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
index eb59fe0cba..dc67e2afd2 100644
--- a/src/test/isolation/specs/reindex-concurrently.spec
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -31,6 +31,22 @@ step "end2" { COMMIT; }
 
 session "s3"
 step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+step "reindex_timeout" { REINDEX TABLE CONCURRENTLY reind_con_tab; } timeout 1
+step "nowarn" { SET client_min_messages = 'ERROR'; }
+
+session "s4"
+step "lock" { BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE; }
+step "unlock" { COMMIT; }
+
+session "s5"
+setup { SET client_min_messages = 'WARNING'; }
+step "normal_reindex" { REINDEX TABLE reind_con_tab; }
+step "check_invalid" {SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C"; }
 
 permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
 permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
@@ -38,3 +54,6 @@ permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
+permutation "check_invalid" "lock" "reindex_timeout" "unlock" "check_invalid"
+            "nowarn" "normal_reindex" "check_invalid"
+            "reindex" "check_invalid"
-- 
2.25.1

#20Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#19)
Re: Add an optional timeout clause to isolationtester step.

On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:

Here's a patch to add an optional "timeout val" clause to isolationtester's
step definition. When used, isolationtester will actively wait on the query
rather than continuing with the permutation next step, and will issue a cancel
once the defined timeout is reached. I also added as a POC the previous
regression tests for invalid TOAST indexes, updated to use this new
infrastructure (which won't pass as long as the original bug for invalid TOAST
indexes isn't fixed).

One problem with this approach is that it does not address the stability
of the test on very slow machines, and there are some of them in the
buildfarm.  Taking your patch, I can make the test fail by applying
the following sleep because the query would be cancelled before some
of the indexes are marked as invalid:
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int
options)
    CommitTransactionCommand();
        StartTransactionCommand();
+   pg_usleep(1000000L * 10); /* 10s */
+
    /*
     * Phase 2 of REINDEX CONCURRENTLY

Another problem is that on faster machines the test is slow because of
the timeout used. What are your thoughts about having a cancel
meta-command instead?
--
Michael

#21Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#20)
Re: Add an optional timeout clause to isolationtester step.

On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote:

On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:

Here's a patch to add an optional "timeout val" clause to isolationtester's
step definition. When used, isolationtester will actively wait on the query
rather than continuing with the permutation next step, and will issue a cancel
once the defined timeout is reached. I also added as a POC the previous
regression tests for invalid TOAST indexes, updated to use this new
infrastructure (which won't pass as long as the original bug for invalid TOAST
indexes isn't fixed).

One problem with this approach is that it does not address the stability
of the test on very slow machines, and there are some of them in the
buildfarm.  Taking your patch, I can make the test fail by applying
the following sleep because the query would be cancelled before some
of the indexes are marked as invalid:
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -3046,6 +3046,8 @@ ReindexRelationConcurrently(Oid relationOid, int
options)
CommitTransactionCommand();
StartTransactionCommand();
+   pg_usleep(1000000L * 10); /* 10s */
+
/*
* Phase 2 of REINDEX CONCURRENTLY

Another problem is that on faster machines the test is slow because of
the timeout used. What are your thoughts about having a cancel
meta-command instead?

Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose
to fix this problem by having a timeout long enough to satisfy the slower
buildfarm members, even when running on fast machines, so I assumed that the
same approach could be used here.

I agree that the 1s timeout I used is maybe too low, but that's easy enough to
change. Another point is that it's possible to get a similar behavior without
this patch by using a statement_timeout (the active wait does change things
though), but the spec files would be more verbose.
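For reference, a minimal sketch of what the statement_timeout alternative could look like in a spec file (session, step, and table names here are hypothetical, just to illustrate the shape):

```
session "s1"
setup    { SET statement_timeout = '2s'; }
step "ric" { REINDEX TABLE CONCURRENTLY some_tab; }
teardown { RESET statement_timeout; }
```

The server then cancels the stuck step on its own once the timeout expires, but every spec needing this behavior has to repeat the setup/teardown boilerplate, and isolationtester still wouldn't actively wait on the step.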

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#21)
Re: Add an optional timeout clause to isolationtester step.

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote:

On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:

Here's a patch to add an optional "timeout val" clause to isolationtester's
step definition. When used, isolationtester will actively wait on the query
rather than continuing with the permutation next step, and will issue a cancel
once the defined timeout is reached.

One problem with this approach is that it does not address the stability
of the test on very slow machines, and there are some of them in the
buildfarm.

Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose
to fix this problem by having a timeout long enough to satisfy the slower
buildfarm members, even when running on fast machines, so I assumed that the
same approach could be used here.

The arbitrarily-set timeouts that exist in some of the isolation tests
are horrid kluges that have caused us lots of headaches in the past
and no doubt will again in the future. Aside from occasionally failing
when a machine is particularly overloaded, they cause the tests to
take far longer than necessary on decently-fast machines. So ideally
we'd get rid of those entirely in favor of some more-dynamic approach.
Admittedly, I have no proposal for what that would be. But adding yet
more ways to set a (guaranteed-to-be-wrong) timeout seems like the
wrong direction to be going in. What's the actual need that you're
trying to deal with?

regards, tom lane

#23Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#22)
Re: Add an optional timeout clause to isolationtester step.

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 10:41:42AM +0900, Michael Paquier wrote:

On Fri, Mar 06, 2020 at 02:15:47PM +0100, Julien Rouhaud wrote:

Here's a patch to add an optional "timeout val" clause to isolationtester's
step definition. When used, isolationtester will actively wait on the query
rather than continuing with the permutation next step, and will issue a cancel
once the defined timeout is reached.

One problem with this approach is that it does not address the stability
of the test on very slow machines, and there are some of them in the
buildfarm.

Looking at timeouts.spec and e.g. a7921f71a3c, it seems that we already chose
to fix this problem by having a timeout long enough to satisfy the slower
buildfarm members, even when running on fast machines, so I assumed that the
same approach could be used here.

The arbitrarily-set timeouts that exist in some of the isolation tests
are horrid kluges that have caused us lots of headaches in the past
and no doubt will again in the future. Aside from occasionally failing
when a machine is particularly overloaded, they cause the tests to
take far longer than necessary on decently-fast machines.

Yeah, I have no doubt that it has been a pain, and this patch is clearly not a
bullet-proof solution.

So ideally
we'd get rid of those entirely in favor of some more-dynamic approach.
Admittedly, I have no proposal for what that would be.

The fault injection framework that was previously discussed would cover most of
the usecase I can think of, but that's a way bigger project.

But adding yet
more ways to set a (guaranteed-to-be-wrong) timeout seems like the
wrong direction to be going in.

Fair enough, I'll mark the patch as rejected then.

What's the actual need that you're trying to deal with?

Testing the correct behavior of non-trivial commands, such as CIC/REINDEX
CONCURRENTLY, that fail during execution.

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#23)
Re: Add an optional timeout clause to isolationtester step.

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

What's the actual need that you're trying to deal with?

Testing the correct behavior of non-trivial commands, such as CIC/REINDEX
CONCURRENTLY, that fail during execution.

Hmm ... don't see how a timeout helps with that?

regards, tom lane

#25Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#24)
Re: Add an optional timeout clause to isolationtester step.

On Sat, Mar 07, 2020 at 04:09:31PM -0500, Tom Lane wrote:

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

What's the actual need that you're trying to deal with?

Testing the correct behavior of non-trivial commands, such as CIC/REINDEX
CONCURRENTLY, that fail during execution.

Hmm ... don't see how a timeout helps with that?

For reindex concurrently, a SELECT FOR UPDATE on a different connection can
ensure that the reindex will be stuck at some point, so canceling the command
after a long enough timeout reproduces the original faulty behavior.

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#25)
Re: Add an optional timeout clause to isolationtester step.

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 04:09:31PM -0500, Tom Lane wrote:

Julien Rouhaud <rjuju123@gmail.com> writes:

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

What's the actual need that you're trying to deal with?

Testing the correct behavior of non-trivial commands, such as CIC/REINDEX
CONCURRENTLY, that fail during execution.

Hmm ... don't see how a timeout helps with that?

For reindex concurrently, a SELECT FOR UPDATE on a different connection can
ensure that the reindex will be stuck at some point, so canceling the command
after a long enough timeout reproduces the original faulty behavior.

Hmm, seems like a pretty arbitrary (and slow) way to test that. I'd
envision testing that by setting up a case with an expression index
where the expression is designed to fail at some point partway through
the build -- say, with a divide-by-zero triggered by one of the tuples
to be indexed.
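
A rough sketch of that approach, with made-up table and index names (this needs a running server, so it's an illustration only, not a tested recipe):

```sql
CREATE TABLE div_tab (a int);
INSERT INTO div_tab VALUES (1), (0);
-- The index expression raises division by zero on the row with a = 0,
-- so the concurrent build fails partway through and the index is left
-- behind marked INVALID.
CREATE INDEX CONCURRENTLY div_idx ON div_tab ((1 / a));
```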

regards, tom lane

#27Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#26)
Re: Add an optional timeout clause to isolationtester step.

On Sat, Mar 07, 2020 at 04:23:58PM -0500, Tom Lane wrote:

Hmm, seems like a pretty arbitrary (and slow) way to test that. I'd
envision testing that by setting up a case with an expression index
where the expression is designed to fail at some point partway through
the build -- say, with a divide-by-zero triggered by one of the tuples
to be indexed.

I am not sure about that.  I think it's very tricky to get an invalid index
_ccold after the swap phase with what the existing test facilities
provide, because the new index is already built at the point where
the dependencies are switched, so you cannot rely on a failure when
building the index. Note also that some tests of CREATE INDEX
CONCURRENTLY rely on the uniqueness to create invalid index entries
(division by zero is fine as well). And, actually, if you rely on
that, you can get invalid _ccnew entries easily:
create table aa (a int);
insert into aa values (1),(1);
create unique index concurrently aai on aa (a);
reindex index concurrently aai;
=# \d aa
Table "public.aa"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
Indexes:
"aai" UNIQUE, btree (a) INVALID
"aai_ccnew" UNIQUE, btree (a) INVALID

That's before the dependency swapping is done though... With a fault
injection facility, it would be possible to test the stability of
the operation by enforcing for example failures after the start of
each inner transaction of REINDEX CONCURRENTLY.
--
Michael

#28Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#18)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Fri, Mar 06, 2020 at 01:36:48PM +0100, Julien Rouhaud wrote:

Ah I see, thanks for the clarification. I guess there's room for improvement
in the comments about that, since the ERRCODE_FEATURE_NOT_SUPPORTED usage is
quite misleading there.

v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid
non-TOAST index anymore.

Thanks. The position of the error check in reindex_relation() is
correct, but as it opens a relation for the cache lookup let's invent
a new routine in lsyscache.c to grab pg_index.indisvalid. It is
possible to make use of this routine with all the other checks:
- WARNING for REINDEX TABLE (non-concurrent)
- ERROR for REINDEX INDEX (non-concurrent)
- ERROR for REINDEX INDEX CONCURRENTLY
(There is already a WARNING for REINDEX TABLE CONCURRENTLY.)

I did not find the addition of an error check in ReindexIndex()
consistent with the existing practice to check the state of the
relation reindexed in reindex_index() (for the non-concurrent case)
and ReindexRelationConcurrently() (for the concurrent case). Okay,
this leads to the introduction of two new ERROR messages related to
invalid toast indexes for the concurrent and the non-concurrent cases
when using REINDEX INDEX instead of one, but having two messages leads
to something much more consistent with the rest, and all checks remain
centralized in the same routines.

For the index-level operation, issuing a WARNING is not consistent
with the existing practice of using an ERROR, which is more appropriate
as the operation is done on a single index at a time.

For the check in reindex_relation, it is more consistent to check the
namespace of the index instead of the parent relation IMO (the
previous patch used "rel", which refers to the parent table). This
has in practice no consequence though.

It would have been nice to test this stuff. However, this requires an
invalid toast index and we cannot create that except by canceling a
concurrent reindex, leading us back to the upthread discussion about
isolation tests, timeouts and fault injection :/

Any opinions?
--
Michael

Attachments:

v2-0001-Forbid-reindex-of-invalid-indexes-on-TOAST-tables.patchtext/x-diff; charset=us-asciiDownload
From 2e808a2971fcaf160f6a1e6c9c80366ff053bccf Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Mon, 9 Mar 2020 14:43:33 +0900
Subject: [PATCH v2] Forbid reindex of invalid indexes on TOAST tables

Such indexes can only be duplicate leftovers of a previously failed
REINDEX CONCURRENTLY command, and a valid equivalent is guaranteed to
exist already.  As we only allow dropping invalid indexes on TOAST
tables, reindexing these would lead to useless duplicate indexes that
can't be dropped anymore.

Thanks to Justin Pryzby for the reminder that this problem was reported
long ago but had never been addressed.

Reported-by: Sergei Kornilov, Justin Pryzby
Author: Julien Rouhaud
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/36712441546604286%40sas1-890ba5c2334a.qloud-c.yandex.net
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
Backpatch-through: 12
---
 src/include/utils/lsyscache.h       |  1 +
 src/backend/catalog/index.c         | 28 ++++++++++++++++++++++++++++
 src/backend/commands/indexcmds.c    | 12 ++++++++++++
 src/backend/utils/cache/lsyscache.c | 23 +++++++++++++++++++++++
 4 files changed, 64 insertions(+)

diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index f132d39458..131d10eab0 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -181,6 +181,7 @@ extern char *get_namespace_name_or_temp(Oid nspid);
 extern Oid	get_range_subtype(Oid rangeOid);
 extern Oid	get_range_collation(Oid rangeOid);
 extern Oid	get_index_column_opclass(Oid index_oid, int attno);
+extern bool get_index_isvalid(Oid index_oid);
 
 #define type_is_array(typid)  (get_element_type(typid) != InvalidOid)
 /* type_is_array_domain accepts both plain arrays and domains over arrays */
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7223679033..ec2d7dc9cb 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -46,6 +46,7 @@
 #include "catalog/pg_depend.h"
 #include "catalog/pg_description.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
 #include "catalog/pg_tablespace.h"
@@ -3474,6 +3475,17 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot reindex temporary tables of other sessions")));
 
+	/*
+	 * Don't allow reindex on an invalid toast index.  This is a leftover from
+	 * a failed REINDEX CONCURRENTLY, and if it were rebuilt it would not be
+	 * possible to drop it anymore.
+	 */
+	if (RelationGetNamespace(iRel) == PG_TOAST_NAMESPACE &&
+		!get_index_isvalid(indexId))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("cannot reindex invalid toast index")));
+
 	/*
 	 * Also check for active uses of the index in the current transaction; we
 	 * don't want to reindex underneath an open indexscan.
@@ -3724,6 +3736,22 @@ reindex_relation(Oid relid, int flags, int options)
 		{
 			Oid			indexOid = lfirst_oid(indexId);
 
+			/*
+			 * Skip any invalid indexes on a TOAST table.  These can only be
+			 * duplicate leftovers from a failed REINDEX CONCURRENTLY, and if
+			 * rebuilt it won't be possible to drop them anymore.
+			 */
+			if (get_rel_namespace(indexOid) == PG_TOAST_NAMESPACE &&
+				!get_index_isvalid(indexOid))
+			{
+				ereport(WARNING,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("cannot reindex invalid toast index \"%s.%s\", skipping",
+								get_namespace_name(get_rel_namespace(indexOid)),
+								get_rel_name(indexOid))));
+				continue;
+			}
+
 			reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
 						  persistence, options);
 
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 3f3a89fe92..6f1dee8643 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -28,6 +28,7 @@
 #include "catalog/pg_am.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_inherits.h"
+#include "catalog/pg_namespace_d.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_tablespace.h"
@@ -2868,6 +2869,17 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 							 errmsg("cannot reindex system catalogs concurrently")));
 
+				/*
+				 * Don't allow reindex of an invalid toast index, as if it were
+				 * rebuilt it would not be possible to drop it anymore.  A valid
+				 * equivalent is also guaranteed to exist already.
+				 */
+				if (get_rel_namespace(relationOid) == PG_TOAST_NAMESPACE &&
+					!get_index_isvalid(relationOid))
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("cannot reindex invalid toast index concurrently")));
+
 				/* Save the list of relation OIDs in private context */
 				oldcontext = MemoryContextSwitchTo(private_context);
 
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 3da90cb72a..400e7689fe 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -3227,3 +3227,26 @@ get_index_column_opclass(Oid index_oid, int attno)
 
 	return opclass;
 }
+
+/*
+ * get_index_isvalid
+ *
+ *		Given the index OID, return pg_index.indisvalid.
+ */
+bool
+get_index_isvalid(Oid index_oid)
+{
+	bool		isvalid;
+	HeapTuple	tuple;
+	Form_pg_index rd_index;
+
+	tuple = SearchSysCache1(INDEXRELID, ObjectIdGetDatum(index_oid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for index %u", index_oid);
+
+	rd_index = (Form_pg_index) GETSTRUCT(tuple);
+	isvalid = rd_index->indisvalid;
+	ReleaseSysCache(tuple);
+
+	return isvalid;
+}
-- 
2.25.1
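For anyone diagnosing the symptom reported at the start of the thread, a catalog query along these lines (a sketch, not part of the patch) lists the leftover invalid TOAST indexes that a cancelled REINDEX CONCURRENTLY can leave behind:

```sql
-- Sketch: find invalid indexes on TOAST tables, i.e. the kind of
-- duplicate leftover a cancelled REINDEX CONCURRENTLY produces.
SELECT n.nspname AS schema, c.relname AS index_name, i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_toast'
  AND NOT i.indisvalid;
```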

#29Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#28)
Re: reindex concurrently and two toast indexes

On Mon, Mar 09, 2020 at 02:52:31PM +0900, Michael Paquier wrote:

On Fri, Mar 06, 2020 at 01:36:48PM +0100, Julien Rouhaud wrote:

v4 attached, which doesn't prevent a REINDEX INDEX CONCURRENTLY on any invalid
non-TOAST index anymore.

Thanks. The position of the error check in reindex_relation() is
correct, but as it opens a relation for the cache lookup let's invent
a new routine in lsyscache.c to grab pg_index.indisvalid. It is
possible to make use of this routine with all the other checks:
- WARNING for REINDEX TABLE (non-concurrent)
- ERROR for REINDEX INDEX (non-concurrent)
- ERROR for REINDEX INDEX CONCURRENTLY
(There is already a WARNING for REINDEX TABLE CONCURRENTLY.)

I did not find the addition of an error check in ReindexIndex()
consistent with the existing practice to check the state of the
relation reindexed in reindex_index() (for the non-concurrent case)
and ReindexRelationConcurrently() (for the concurrent case). Okay,
this leads to the introduction of two new ERROR messages related to
invalid toast indexes for the concurrent and the non-concurrent cases
when using REINDEX INDEX instead of one, but having two messages leads
to something much more consistent with the rest, and all checks remain
centralized in the same routines.

I wanted to go this way at first but hesitated and finally chose to add less
checks, so I'm fine with this approach, and patch looks good to me.

For the index-level operation, issuing a WARNING is not consistent
with the existing practice to use an ERROR, which is more adapted as
the operation is done on a single index at a time.

Agreed.

For the check in reindex_relation, it is more consistent to check the
namespace of the index instead of the parent relation IMO (the
previous patch used "rel", which refers to the parent table). This
has in practice no consequence though.

Oops yes.

It would have been nice to test this stuff. However, this requires an
invalid toast index and we cannot create that except by canceling a
concurrent reindex, leading us back to the upthread discussion about
isolation tests, timeouts and fault injection :/

Yes, unfortunately I don't see an acceptable way to add tests for that without
some kind of fault injection, so this will have to wait :(

#30Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#22)
Re: Add an optional timeout clause to isolationtester step.

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

The arbitrarily-set timeouts that exist in some of the isolation tests
are horrid kluges that have caused us lots of headaches in the past
and no doubt will again in the future. Aside from occasionally failing
when a machine is particularly overloaded, they cause the tests to
take far longer than necessary on decently-fast machines. So ideally
we'd get rid of those entirely in favor of some more-dynamic approach.
Admittedly, I have no proposal for what that would be. But adding yet
more ways to set a (guaranteed-to-be-wrong) timeout seems like the
wrong direction to be going in. What's the actual need that you're
trying to deal with?

As a matter of fact, the buildfarm member petalura just reported a
failure with the isolation test "timeouts", the machine being
extremely slow:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05

test timeouts                     ... FAILED    60330 ms
[...]
-step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...>
-step update: <... completed>
+step update: DELETE FROM accounts WHERE accountid = 'checking';
 ERROR:  canceling statement due to statement timeout
--
Michael
#31Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#30)
Re: Add an optional timeout clause to isolationtester step.

On Mon, Mar 09, 2020 at 04:47:27PM +0900, Michael Paquier wrote:

On Sat, Mar 07, 2020 at 10:46:34AM -0500, Tom Lane wrote:

The arbitrarily-set timeouts that exist in some of the isolation tests
are horrid kluges that have caused us lots of headaches in the past
and no doubt will again in the future. Aside from occasionally failing
when a machine is particularly overloaded, they cause the tests to
take far longer than necessary on decently-fast machines. So ideally
we'd get rid of those entirely in favor of some more-dynamic approach.
Admittedly, I have no proposal for what that would be. But adding yet
more ways to set a (guaranteed-to-be-wrong) timeout seems like the
wrong direction to be going in. What's the actual need that you're
trying to deal with?

As a matter of fact, the buildfarm member petalura just reported a
failure with the isolation test "timeouts", the machine being
extremely slow:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-03-08%2011%3A20%3A05

test timeouts                     ... FAILED    60330 ms
[...]
-step update: DELETE FROM accounts WHERE accountid = 'checking'; <waiting ...>
-step update: <... completed>
+step update: DELETE FROM accounts WHERE accountid = 'checking';
ERROR:  canceling statement due to statement timeout

Indeed. I guess we could add some kind of environment variable facility in
isolationtester to let slow machine owner put a way bigger timeout without
making the test super slow for everyone else, but that seems overkill for just
one test, and given the other thread about deploying REL_11 build-farm client,
that wouldn't be an immediate fix either.

#32Andres Freund
andres@anarazel.de
In reply to: Julien Rouhaud (#25)
Re: Add an optional timeout clause to isolationtester step.

Hi,

On 2020-03-07 22:17:09 +0100, Julien Rouhaud wrote:

For reindex concurrently, a SELECT FOR UPDATE on a different connection can
ensure that the reindex will be stuck at some point, so canceling the command
after a long enough timeout reproduces the original faulty behavior.

That kind of thing can already be done using statement_timeout or
lock_timeout, no?

Greetings,

Andres Freund

#33Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#32)
Re: Add an optional timeout clause to isolationtester step.

On Mon, Mar 09, 2020 at 03:15:58PM -0700, Andres Freund wrote:

On 2020-03-07 22:17:09 +0100, Julien Rouhaud wrote:

For reindex concurrently, a SELECT FOR UPDATE on a different connection can
ensure that the reindex will be stuck at some point, so canceling the command
after a long enough timeout reproduces the original faulty behavior.

That kind of thing can already be done using statement_timeout or
lock_timeout, no?

Yep, still that's not something I would recommend to commit in the
tree as that's a double-edged sword as you already know. For slower
machines, you need a statement_timeout large enough to make
sure that the state you want the query to wait for is reached, which
has a cost on all other faster machines as it makes the tests slower.
--
Michael

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#33)
Re: Add an optional timeout clause to isolationtester step.

Michael Paquier <michael@paquier.xyz> writes:

On Mon, Mar 09, 2020 at 03:15:58PM -0700, Andres Freund wrote:

That kind of thing can already be done using statement_timeout or
lock_timeout, no?

Yep, still that's not something I would recommend to commit in the
tree as that's a double-edged sword as you already know. For slower
machines, you need a statement_timeout large enough to make
sure that the state you want the query to wait for is reached, which
has a cost on all other faster machines as it makes the tests slower.

It strikes me to wonder whether we could improve matters by teaching
isolationtester to watch for particular values in a connected backend's
pg_stat_activity.wait_event_type/wait_event columns. Those columns
didn't exist when isolationtester was designed, IIRC, so it's not
surprising that they're not used in the current design. But we could
use them perhaps to detect that a backend has arrived at some state
that's not a heavyweight-lock-wait state.

regards, tom lane

#35Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#34)
Re: Add an optional timeout clause to isolationtester step.

On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote:

It strikes me to wonder whether we could improve matters by teaching
isolationtester to watch for particular values in a connected backend's
pg_stat_activity.wait_event_type/wait_event columns. Those columns
didn't exist when isolationtester was designed, IIRC, so it's not
surprising that they're not used in the current design. But we could
use them perhaps to detect that a backend has arrived at some state
that's not a heavyweight-lock-wait state.

Interesting idea. So that would be basically an equivalent of
PostgresNode::poll_query_until but for the isolation tester? In short
we gain a meta-command that runs a SELECT query that waits until the
query defined in the command returns true. The polling interval may
be tricky to set though. If set too low it would consume resources
for nothing, and if set too large it would make the tests using this
meta-command slower than they actually need to be. Perhaps something
like 100ms may be fine..
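The wait_event columns Tom refers to can be inspected with a query of roughly this shape (the PID below is a hypothetical placeholder):

```sql
-- Sketch: what is backend 12345 currently waiting on?
-- A heavyweight-lock wait shows wait_event_type = 'Lock'.
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE pid = 12345;  -- hypothetical backend PID under test
```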
--
Michael

#36Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#29)
1 attachment(s)
Re: reindex concurrently and two toast indexes

On Mon, Mar 09, 2020 at 08:04:27AM +0100, Julien Rouhaud wrote:

On Mon, Mar 09, 2020 at 02:52:31PM +0900, Michael Paquier wrote:

For the index-level operation, issuing a WARNING is not consistent
with the existing practice to use an ERROR, which is more adapted as
the operation is done on a single index at a time.

Agreed.

Thanks for checking the patch.

It would have been nice to test this stuff. However, this requires an
invalid toast index and we cannot create that except by canceling a
concurrent reindex, leading us back to the upthread discussion about
isolation tests, timeouts and fault injection :/

Yes, unfortunately I don't see an acceptable way to add tests for that without
some kind of fault injection, so this will have to wait :(

Let's discuss that separately.

I have also been reviewing the isolation test you have added upthread
about the dependency handling of invalid indexes, and one thing that
we cannot really do is attempting to do a reindex at index or
table-level with invalid toast indexes as this leads to unstable ERROR
or WARNING messages. But at least one thing we can do is to extend
the query you sent directly so as it exposes the toast relation name
filtered with regexp_replace(). This gives us a stable output, and
this way the test makes sure that the query cancellation happened
after the dependencies are swapped, and not at build or validation
time (indisvalid got appended to the end of the output):
+pg_toast.pg_toast_<oid>_index_ccoldf
+pg_toast.pg_toast_<oid>_indext

Please feel free to see the attached for reference, that's not
something for commit in upstream, but I am going to keep that around
in my own plugin tree.
--
Michael

Attachments:

0001-Add-isolation-test-to-check-dependency-handling-of-i.patch (text/x-diff; charset=us-ascii)
From fd988b4141725082b30148da31d9c2a6a88c7319 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 10 Mar 2020 12:01:23 +0900
Subject: [PATCH] Add isolation test to check dependency handling of invalid
 indexes

This uses a statement_timeout of 1s, which is not really a good idea :D
---
 .../isolation/expected/reindex-invalid.out    | 34 +++++++++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 src/test/isolation/specs/reindex-invalid.spec | 38 +++++++++++++++++++
 3 files changed, 73 insertions(+)
 create mode 100644 src/test/isolation/expected/reindex-invalid.out
 create mode 100644 src/test/isolation/specs/reindex-invalid.spec

diff --git a/src/test/isolation/expected/reindex-invalid.out b/src/test/isolation/expected/reindex-invalid.out
new file mode 100644
index 0000000000..ef1eff9ce7
--- /dev/null
+++ b/src/test/isolation/expected/reindex-invalid.out
@@ -0,0 +1,34 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1_select s2_timeout s2_reindex s2_check s1_commit s1_drop s2_check
+step s1_select: SELECT data FROM reind_con_invalid FOR UPDATE;
+data           
+
+foo            
+bar            
+step s2_timeout: SET statement_timeout = 1000;
+step s2_reindex: REINDEX TABLE CONCURRENTLY reind_con_invalid; <waiting ...>
+step s2_reindex: <... completed>
+ERROR:  canceling statement due to statement timeout
+step s2_check: SELECT regexp_replace(i.indexrelid::regclass::text, '(pg_toast_)([0-9+/=]+)(_index)', '\1<oid>\3'),
+					         i.indisvalid
+					    FROM pg_class c
+					      JOIN pg_class t ON t.oid = c.reltoastrelid
+					      JOIN pg_index i ON i.indrelid = t.oid
+					    WHERE c.relname = 'reind_con_invalid'
+					      ORDER BY c.relname;
+regexp_replace indisvalid     
+
+pg_toast.pg_toast_<oid>_index_ccoldf              
+pg_toast.pg_toast_<oid>_indext              
+step s1_commit: COMMIT;
+step s1_drop: DROP TABLE reind_con_invalid;
+step s2_check: SELECT regexp_replace(i.indexrelid::regclass::text, '(pg_toast_)([0-9+/=]+)(_index)', '\1<oid>\3'),
+					         i.indisvalid
+					    FROM pg_class c
+					      JOIN pg_class t ON t.oid = c.reltoastrelid
+					      JOIN pg_index i ON i.indrelid = t.oid
+					    WHERE c.relname = 'reind_con_invalid'
+					      ORDER BY c.relname;
+regexp_replace indisvalid     
+
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 2873cd7c21..28e5b8b16a 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -48,6 +48,7 @@ test: lock-committed-update
 test: lock-committed-keyupdate
 test: update-locked-tuple
 test: reindex-concurrently
+test: reindex-invalid
 test: propagate-lock-delete
 test: tuplelock-conflict
 test: tuplelock-update
diff --git a/src/test/isolation/specs/reindex-invalid.spec b/src/test/isolation/specs/reindex-invalid.spec
new file mode 100644
index 0000000000..bf866d2279
--- /dev/null
+++ b/src/test/isolation/specs/reindex-invalid.spec
@@ -0,0 +1,38 @@
+# REINDEX CONCURRENTLY with invalid indexes
+#
+# Check that handling of dependencies with invalid indexes is correct.
+# When the parent table is dropped, all the indexes are dropped.
+
+setup
+{
+	CREATE TABLE reind_con_invalid (id serial primary key, data text);
+	INSERT INTO reind_con_invalid (data) VALUES ('foo');
+	INSERT INTO reind_con_invalid (data) VALUES ('bar');
+}
+
+# No need to drop the table at teardown phase here as each permutation
+# takes care of it internally.
+
+session "s1"
+setup { BEGIN; }
+step "s1_select" { SELECT data FROM reind_con_invalid FOR UPDATE; }
+step "s1_commit" { COMMIT; }
+step "s1_drop"   { DROP TABLE reind_con_invalid; }
+
+session "s2"
+step "s2_timeout" { SET statement_timeout = 1000; }
+step "s2_reindex" { REINDEX TABLE CONCURRENTLY reind_con_invalid; }
+step "s2_check" { SELECT regexp_replace(i.indexrelid::regclass::text, '(pg_toast_)([0-9+/=]+)(_index)', '\1<oid>\3'),
+					         i.indisvalid
+					    FROM pg_class c
+					      JOIN pg_class t ON t.oid = c.reltoastrelid
+					      JOIN pg_index i ON i.indrelid = t.oid
+					    WHERE c.relname = 'reind_con_invalid'
+					      ORDER BY c.relname; }
+
+# A failed REINDEX CONCURRENTLY will leave an invalid index on the TOAST
+# table of the parent table.  Any following successful REINDEX should leave
+# this index invalid, otherwise we would end up with a useless and
+# duplicated index that can't be dropped.  Any invalid index should be
+# dropped once the parent table is dropped.
+permutation "s1_select" "s2_timeout" "s2_reindex" "s2_check" "s1_commit" "s1_drop" "s2_check"
-- 
2.25.1

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#35)
Re: Add an optional timeout clause to isolationtester step.

Michael Paquier <michael@paquier.xyz> writes:

On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote:

It strikes me to wonder whether we could improve matters by teaching
isolationtester to watch for particular values in a connected backend's
pg_stat_activity.wait_event_type/wait_event columns. Those columns
didn't exist when isolationtester was designed, IIRC, so it's not
surprising that they're not used in the current design. But we could
use them perhaps to detect that a backend has arrived at some state
that's not a heavyweight-lock-wait state.

Interesting idea. So that would be basically an equivalent of
PostgresNode::poll_query_until but for the isolation tester?

No, more like the existing isolationtester wait query, which watches
for something being blocked on a heavyweight lock. Right now, that
one depends on a bespoke function pg_isolation_test_session_is_blocked(),
but it used to be a query on pg_stat_activity/pg_locks.

In short
we gain a meta-command that runs a SELECT query that waits until the
query defined in the command returns true. The polling interval may
be tricky to set though.

I think it'd be just the same as the polling interval for the existing
wait query. We'd have to have some way to mark a script step to say
what to check to decide that it's blocked ...

regards, tom lane

#38Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#36)
Re: reindex concurrently and two toast indexes

On Tue, Mar 10, 2020 at 12:09:42PM +0900, Michael Paquier wrote:

On Mon, Mar 09, 2020 at 08:04:27AM +0100, Julien Rouhaud wrote:

Agreed.

Thanks for checking the patch.

And applied as 61d7c7b. Regarding the isolation tests, let's
brainstorm on what we can do, but I am afraid that it is too late for
13.
--
Michael

#39Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#37)
2 attachment(s)
Re: Add an optional timeout clause to isolationtester step.

On Tue, Mar 10, 2020 at 12:09:12AM -0400, Tom Lane wrote:

Michael Paquier <michael@paquier.xyz> writes:

On Mon, Mar 09, 2020 at 10:32:27PM -0400, Tom Lane wrote:

It strikes me to wonder whether we could improve matters by teaching
isolationtester to watch for particular values in a connected backend's
pg_stat_activity.wait_event_type/wait_event columns. Those columns
didn't exist when isolationtester was designed, IIRC, so it's not
surprising that they're not used in the current design. But we could
use them perhaps to detect that a backend has arrived at some state
that's not a heavyweight-lock-wait state.

Interesting idea. So that would be basically an equivalent of
PostgresNode::poll_query_until but for the isolation tester?

No, more like the existing isolationtester wait query, which watches
for something being blocked on a heavyweight lock. Right now, that
one depends on a bespoke function pg_isolation_test_session_is_blocked(),
but it used to be a query on pg_stat_activity/pg_locks.

Ah interesting indeed!

In short
we gain a meta-command that runs a SELECT query that waits until the
query defined in the command returns true. The polling interval may
be tricky to set though.

I think it'd be just the same as the polling interval for the existing
wait query. We'd have to have some way to mark a script step to say
what to check to decide that it's blocked ...

So basically we could just change pg_isolation_test_session_is_blocked() to
also return the wait_event_type and wait_event, and adding something like

step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ]

to the step definition should be enough. I'm attaching a POC patch for that.
On my laptop, the full test now completes in about 400ms.

FTR the REINDEX TABLE CONCURRENTLY case is eventually locked on a virtualxid,
I'm not sure if that could lead to too early cancellation.
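As a concrete illustration of the proposed grammar (the table name and wait event values below are hypothetical, chosen to match the REINDEX CONCURRENTLY scenario discussed above), a spec step could read:

```
session "s2"
step "s2_reindex" { REINDEX TABLE CONCURRENTLY test_tab; } cancel on "Lock" "virtualxid"
```

With the POC patch, isolationtester would then cancel s2_reindex as soon as it is seen waiting on that event, instead of relying on a fixed timeout.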

Attachments:

v2-0001-Add-an-optional-cancel-on-clause-to-isolationtest.patch (text/x-diff; charset=us-ascii)
From 77c214c9fb2fa7f9a003b96db0e5ec6506217e38 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 6 Mar 2020 13:40:56 +0100
Subject: [PATCH v2 1/2] Add an optional cancel on clause to isolationtester
 steps.

Some sanity checks can require a command to wait on a lock and eventually be
cancelled.  The only way to do that was to rely on calls to pg_cancel_backend()
filtering the pg_stat_activity view, but this isn't a satisfactory solution as
there's no way to guarantee that only the wanted backend will be canceled.

Instead, add a new optional "cancel on <wait_event_type> <wait_event>" clause
to the step definition.  When this clause is specified, isolationtester will
actively wait on that command, and issue a cancel when the query is waiting on
the given wait event.

Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/20200305035354.GQ2593%40paquier.xyz
---
 src/backend/utils/adt/lockfuncs.c    | 54 ++++++++++++++++++++++++++--
 src/include/catalog/pg_proc.dat      |  5 ++-
 src/test/isolation/README            | 12 +++++--
 src/test/isolation/isolationtester.c | 43 +++++++++++++++++-----
 src/test/isolation/isolationtester.h |  8 +++++
 src/test/isolation/specparse.y       | 20 +++++++++--
 src/test/isolation/specscanner.l     |  2 ++
 7 files changed, 127 insertions(+), 17 deletions(-)

diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index ecb1bf92ff..cc2bc94d0d 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -17,7 +17,9 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "storage/predicate_internals.h"
+#include "storage/procarray.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 
@@ -578,17 +580,28 @@ pg_safe_snapshot_blocking_pids(PG_FUNCTION_ARGS)
 Datum
 pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
 {
+	TupleDesc	tupdesc;
 	int			blocked_pid = PG_GETARG_INT32(0);
 	ArrayType  *interesting_pids_a = PG_GETARG_ARRAYTYPE_P(1);
 	ArrayType  *blocking_pids_a;
 	int32	   *interesting_pids;
 	int32	   *blocking_pids;
+	Datum		values[3];
+	bool		nulls[3];
+	PGPROC	   *proc;
+	uint32		raw_wait_event = 0;
+	const char *wait_event_type = NULL;
+	const char *wait_event = NULL;
 	int			num_interesting_pids;
 	int			num_blocking_pids;
 	int			dummy;
 	int			i,
 				j;
 
+	/* Initialise values and NULL flags arrays */
+	MemSet(values, 0, sizeof(values));
+	MemSet(nulls, 0, sizeof(nulls));
+
 	/* Validate the passed-in array */
 	Assert(ARR_ELEMTYPE(interesting_pids_a) == INT4OID);
 	if (array_contains_nulls(interesting_pids_a))
@@ -597,6 +610,34 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
 	num_interesting_pids = ArrayGetNItems(ARR_NDIM(interesting_pids_a),
 										  ARR_DIMS(interesting_pids_a));
 
+	/* Initialise attributes information in the tuple descriptor */
+	tupdesc = CreateTemplateTupleDesc(3);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 1, "blocked",
+					   BOOLOID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 2, "wait_event_type",
+					   TEXTOID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) 3, "wait_event",
+					   TEXTOID, -1, 0);
+
+	BlessTupleDesc(tupdesc);
+
+	proc = BackendPidGetProc(blocked_pid);
+	if (proc)
+	{
+#define UINT32_ACCESS_ONCE(var)		 ((uint32)(*((volatile uint32 *)&(var))))
+		raw_wait_event = UINT32_ACCESS_ONCE(proc->wait_event_info);
+		wait_event_type = pgstat_get_wait_event_type(raw_wait_event);
+		wait_event = pgstat_get_wait_event(raw_wait_event);
+
+		if (wait_event_type != NULL)
+			values[1] = CStringGetTextDatum(wait_event_type);
+		if (wait_event != NULL)
+			values[2] = CStringGetTextDatum(wait_event);
+	}
+
+	nulls[1] = (wait_event_type == NULL);
+	nulls[2] = (wait_event == NULL);
+
 	/*
 	 * Get the PIDs of all sessions blocking the given session's attempt to
 	 * acquire heavyweight locks.
@@ -623,7 +664,11 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
 		for (j = 0; j < num_interesting_pids; j++)
 		{
 			if (blocking_pids[i] == interesting_pids[j])
-				PG_RETURN_BOOL(true);
+			{
+				values[0] = BoolGetDatum(true);
+				PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc,
+								values, nulls)));
+			}
 		}
 
 	/*
@@ -636,9 +681,12 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
 	 * buffer and check if the number of safe snapshot blockers is non-zero.
 	 */
 	if (GetSafeSnapshotBlockingPids(blocked_pid, &dummy, 1) > 0)
-		PG_RETURN_BOOL(true);
+		values[0] = BoolGetDatum(true);
+	else
+		values[0] = BoolGetDatum(false);
 
-	PG_RETURN_BOOL(false);
+	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc,
+					values, nulls)));
 }
 
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7fb574f9dc..77529e181d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5869,7 +5869,10 @@
   prosrc => 'pg_safe_snapshot_blocking_pids' },
 { oid => '3378', descr => 'isolationtester support function',
   proname => 'pg_isolation_test_session_is_blocked', provolatile => 'v',
-  prorettype => 'bool', proargtypes => 'int4 _int4',
+  prorettype => 'record', proargtypes => 'int4 _int4',
+  proallargtypes => '{int4,_int4,bool,text,text}',
+  proargmodes => '{i,i,o,o,o}',
+  proargnames => '{blocking_pid,interesting_pids,blocked,wait_event_type,wait_event}',
   prosrc => 'pg_isolation_test_session_is_blocked' },
 { oid => '1065', descr => 'view two-phase transactions',
   proname => 'pg_prepared_xact', prorows => '1000', proretset => 't',
diff --git a/src/test/isolation/README b/src/test/isolation/README
index 217953d183..04aed5cd17 100644
--- a/src/test/isolation/README
+++ b/src/test/isolation/README
@@ -86,10 +86,12 @@ session "<name>"
 
   Each step has the syntax
 
-  step "<name>" { <SQL> }
+  step "<name>" { <SQL> } [ cancel on "<wait_event_type>" "<wait_event>" ]
 
-  where <name> is a name identifying this step, and SQL is a SQL statement
-  (or statements, separated by semicolons) that is executed in the step.
+  where <name> is a name identifying this step, SQL is a SQL statement
+  (or statements, separated by semicolons) that is executed in the step, and
+  <wait_event_type> and <wait_event> are a wait event specification for which
+  isolationtester will cancel the query if it is blocked on it.
   Step names must be unique across the whole spec file.
 
 permutation "<step name>" ...
@@ -125,6 +127,10 @@ after PGISOLATIONTIMEOUT seconds.  If the cancel doesn't work, isolationtester
 will exit uncleanly after a total of twice PGISOLATIONTIMEOUT.  Testing
 invalid permutations should be avoided because they can make the isolation
 tests take a very long time to run, and they serve no useful testing purpose.
+If a test specifies the optional cancel on clause, then isolationtester
+will actively wait for the step command's completion rather than continuing
+with the permutation's next step, and send a cancel once the query is
+blocked on the given wait event.
 
 Note that isolationtester recognizes that a command has blocked by looking
 to see if it is shown as waiting in the pg_locks view; therefore, only
diff --git a/src/test/isolation/isolationtester.c b/src/test/isolation/isolationtester.c
index f80261c022..f530d9923d 100644
--- a/src/test/isolation/isolationtester.c
+++ b/src/test/isolation/isolationtester.c
@@ -43,6 +43,7 @@ static void run_permutation(TestSpec *testspec, int nsteps, Step **steps);
 
 #define STEP_NONBLOCK	0x1		/* return 0 as soon as cmd waits for a lock */
 #define STEP_RETRY		0x2		/* this is a retry of a previously-waiting cmd */
+#define STEP_WAIT		0x4		/* active wait for a given wait event */
 static bool try_complete_step(TestSpec *testspec, Step *step, int flags);
 
 static int	step_qsort_cmp(const void *a, const void *b);
@@ -212,7 +213,7 @@ main(int argc, char **argv)
 	 */
 	initPQExpBuffer(&wait_query);
 	appendPQExpBufferStr(&wait_query,
-						 "SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{");
+						 "SELECT * FROM pg_catalog.pg_isolation_test_session_is_blocked($1, '{");
 	/* The spec syntax requires at least one session; assume that here. */
 	appendPQExpBufferStr(&wait_query, backend_pid_strs[1]);
 	for (i = 2; i < nconns; i++)
@@ -587,8 +588,14 @@ run_permutation(TestSpec *testspec, int nsteps, Step **steps)
 			exit(1);
 		}
 
-		/* Try to complete this step without blocking.  */
-		mustwait = try_complete_step(testspec, step, STEP_NONBLOCK);
+		/*
+		 * Try to complete this step without blocking, unless the step has a
+		 * cancel on clause.
+		 */
+		mustwait = try_complete_step(testspec, step,
+				(step->waitinfo ? STEP_WAIT : STEP_NONBLOCK));
+		if (step->waitinfo)
+			report_error_message(step);
 
 		/* Check for completion of any steps that were previously waiting. */
 		w = 0;
@@ -720,10 +727,12 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 		else if (ret == 0)		/* select() timeout: check for lock wait */
 		{
 			struct timeval current_time;
+			char  *wait_event_type = "";
+			char  *wait_event = "";
 			int64		td;
 
 			/* If it's OK for the step to block, check whether it has. */
-			if (flags & STEP_NONBLOCK)
+			if (flags & (STEP_NONBLOCK | STEP_WAIT))
 			{
 				bool		waiting;
 
@@ -738,9 +747,17 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 					exit(1);
 				}
 				waiting = ((PQgetvalue(res, 0, 0))[0] == 't');
+				if (waiting && (flags & STEP_WAIT))
+				{
+					wait_event_type = pg_strdup(PQgetvalue(res, 0, 1));
+					Assert(wait_event_type != NULL);
+
+					wait_event = pg_strdup(PQgetvalue(res, 0, 2));
+					Assert(wait_event != NULL);
+				}
 				PQclear(res);
 
-				if (waiting)	/* waiting to acquire a lock */
+				if (waiting && (flags & STEP_NONBLOCK))	/* waiting to acquire a lock */
 				{
 					/*
 					 * Since it takes time to perform the lock-check query,
@@ -787,7 +804,11 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 			 * failing, but remaining permutations and tests should still be
 			 * OK.
 			 */
-			if (td > max_step_wait && !canceled)
+			if ((td > max_step_wait ||
+				(step->waitinfo
+				 && strcmp(step->waitinfo->wait_event_type, wait_event_type) == 0
+				 && strcmp(step->waitinfo->wait_event, wait_event) == 0))
+				&& !canceled)
 			{
 				PGcancel   *cancel = PQgetCancel(conn);
 
@@ -801,8 +822,14 @@ try_complete_step(TestSpec *testspec, Step *step, int flags)
 						 * print to stdout not stderr, as this should appear
 						 * in the test case's results
 						 */
-						printf("isolationtester: canceling step %s after %d seconds\n",
-							   step->name, (int) (td / USECS_PER_SEC));
+						if (!step->waitinfo)
+							printf("isolationtester: canceling step %s after %d seconds\n",
+								   step->name, (int) (td / USECS_PER_SEC));
+						else
+							printf("isolationtester: canceling step %s on wait event %s/%s\n",
+									step->name,
+									step->waitinfo->wait_event_type,
+									step->waitinfo->wait_event);
 						canceled = true;
 					}
 					else
diff --git a/src/test/isolation/isolationtester.h b/src/test/isolation/isolationtester.h
index 9cf5012416..5df540485c 100644
--- a/src/test/isolation/isolationtester.h
+++ b/src/test/isolation/isolationtester.h
@@ -26,6 +26,12 @@ struct Session
 	int			nsteps;
 };
 
+typedef struct WaitInfo
+{
+	char	   *wait_event_type;
+	char	   *wait_event;
+} WaitInfo;
+
 struct Step
 {
 	int			session;
@@ -33,6 +39,8 @@ struct Step
 	char	   *name;
 	char	   *sql;
 	char	   *errormsg;
+	WaitInfo   *waitinfo;
+	struct timeval start_time;
 };
 
 typedef struct
diff --git a/src/test/isolation/specparse.y b/src/test/isolation/specparse.y
index 5e007e1bf0..c8e72f4316 100644
--- a/src/test/isolation/specparse.y
+++ b/src/test/isolation/specparse.y
@@ -27,6 +27,7 @@ TestSpec		parseresult;			/* result of parsing is left here */
 	char	   *str;
 	Session	   *session;
 	Step	   *step;
+	WaitInfo   *waitinfo;
 	Permutation *permutation;
 	struct
 	{
@@ -43,9 +44,11 @@ TestSpec		parseresult;			/* result of parsing is left here */
 %type <session> session
 %type <step> step
 %type <permutation> permutation
+%type <waitinfo> opt_cancel
 
 %token <str> sqlblock string_literal
-%token PERMUTATION SESSION SETUP STEP TEARDOWN TEST
+%token <str> LITERAL
+%token CANCEL ON PERMUTATION SESSION SETUP STEP TEARDOWN TEST
 
 %%
 
@@ -140,16 +143,29 @@ step_list:
 
 
 step:
-			STEP string_literal sqlblock
+			STEP string_literal sqlblock opt_cancel
 			{
 				$$ = pg_malloc(sizeof(Step));
 				$$->name = $2;
 				$$->sql = $3;
 				$$->used = false;
 				$$->errormsg = NULL;
+				$$->waitinfo = $4;
 			}
 		;
 
+opt_cancel:
+			CANCEL ON string_literal string_literal
+			{
+				$$ = pg_malloc(sizeof(WaitInfo));
+				$$->wait_event_type = $3;
+				$$->wait_event = $4;
+			}
+			| /* EMPTY */
+			{
+				$$ = NULL;
+			}
+		;
 
 opt_permutation_list:
 			permutation_list
diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
index 410f17727e..672e5fa2fc 100644
--- a/src/test/isolation/specscanner.l
+++ b/src/test/isolation/specscanner.l
@@ -48,6 +48,8 @@ comment			("#"{non_newline}*)
 	litbufsize = LITBUF_INIT;
 %}
 
+cancel			{ return CANCEL; }
+on				{ return ON; }
 permutation		{ return PERMUTATION; }
 session			{ return SESSION; }
 setup			{ return SETUP; }
-- 
2.25.1

v2-0002-Add-regression-tests-for-failed-REINDEX-TABLE-CON.patchtext/x-diff; charset=us-asciiDownload
From 60267b2fb3ff8e3b632ee07d12ee4031543601d3 Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Fri, 6 Mar 2020 13:51:35 +0100
Subject: [PATCH v2 2/2] Add regression tests for failed REINDEX TABLE
 CONCURRENTLY.

If a REINDEX TABLE CONCURRENTLY fails on a table having a TOAST table, an
invalid index will be present for the TOAST table.  As we only allow dropping
invalid indexes on TOAST tables, reindexing those would lead to useless
duplicated indexes that can't be dropped anymore.

Author: Julien Rouhaud
Reviewed-by:
Discussion: https://postgr.es/m/20200216190835.GA21832@telsasoft.com
---
 .../expected/reindex-concurrently.out         | 54 ++++++++++++++++++-
 .../isolation/specs/reindex-concurrently.spec | 19 +++++++
 2 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/src/test/isolation/expected/reindex-concurrently.out b/src/test/isolation/expected/reindex-concurrently.out
index 9e04169b2f..674a8890bd 100644
--- a/src/test/isolation/expected/reindex-concurrently.out
+++ b/src/test/isolation/expected/reindex-concurrently.out
@@ -1,4 +1,4 @@
-Parsed test spec with 3 sessions
+Parsed test spec with 5 sessions
 
 starting permutation: reindex sel1 upd2 ins2 del2 end1 end2
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
@@ -76,3 +76,55 @@ step end1: COMMIT;
 step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab; <waiting ...>
 step end2: COMMIT;
 step reindex: <... completed>
+
+starting permutation: check_invalid lock reindex_fail unlock check_invalid nowarn normal_reindex check_invalid reindex check_invalid
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+t              
+step lock: BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE;
+data           
+
+aa             
+isolationtester: canceling step reindex_fail on wait event Lock/virtualxid
+step reindex_fail: REINDEX TABLE CONCURRENTLY reind_con_tab;
+ERROR:  canceling statement due to user request
+step unlock: COMMIT;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step nowarn: SET client_min_messages = 'ERROR';
+step normal_reindex: REINDEX TABLE reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
+step reindex: REINDEX TABLE CONCURRENTLY reind_con_tab;
+step check_invalid: SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C";
+indisvalid     
+
+f              
+t              
diff --git a/src/test/isolation/specs/reindex-concurrently.spec b/src/test/isolation/specs/reindex-concurrently.spec
index eb59fe0cba..a3cd8b72a5 100644
--- a/src/test/isolation/specs/reindex-concurrently.spec
+++ b/src/test/isolation/specs/reindex-concurrently.spec
@@ -31,6 +31,22 @@ step "end2" { COMMIT; }
 
 session "s3"
 step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; }
+step "reindex_fail" { REINDEX TABLE CONCURRENTLY reind_con_tab; } cancel on "Lock" "virtualxid"
+step "nowarn" { SET client_min_messages = 'ERROR'; }
+
+session "s4"
+step "lock" { BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE; }
+step "unlock" { COMMIT; }
+
+session "s5"
+setup { SET client_min_messages = 'ERROR'; }
+step "normal_reindex" { REINDEX TABLE reind_con_tab; }
+step "check_invalid" {SELECT i.indisvalid
+    FROM pg_class c
+    JOIN pg_class t ON t.oid = c.reltoastrelid
+    JOIN pg_index i ON i.indrelid = t.oid
+    WHERE c.relname = 'reind_con_tab'
+    ORDER BY i.indisvalid::text COLLATE "C"; }
 
 permutation "reindex" "sel1" "upd2" "ins2" "del2" "end1" "end2"
 permutation "sel1" "reindex" "upd2" "ins2" "del2" "end1" "end2"
@@ -38,3 +54,6 @@ permutation "sel1" "upd2" "reindex" "ins2" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "reindex" "del2" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "reindex" "end1" "end2"
 permutation "sel1" "upd2" "ins2" "del2" "end1" "reindex" "end2"
+permutation "check_invalid" "lock" "reindex_fail" "unlock" "check_invalid"
+            "nowarn" "normal_reindex" "check_invalid"
+            "reindex" "check_invalid"
-- 
2.25.1

#40Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#39)
Re: Add an optional timeout clause to isolationtester step.

On Tue, Mar 10, 2020 at 02:53:36PM +0100, Julien Rouhaud wrote:

So basically we could just change pg_isolation_test_session_is_blocked() to
also return the wait_event_type and wait_event, and adding something like

Hmm. I think that Tom has in mind the reasons behind 511540d here.

step "<name>" { SQL } [ cancel on "<wait_event_type>" "<wait_event>" ]

to the step definition should be enough. I'm attaching a POC patch for that.
On my laptop, the full test now completes in about 400ms.

Not much of a fan of that, given the lack of flexibility, but we have a
single function to avoid a huge performance impact when using
CLOBBER_CACHE_ALWAYS, so we cannot really use SQL-based logic
either...

FTR the REINDEX TABLE CONCURRENTLY case eventually gets locked on a virtualxid;
I'm not sure if that could lead to too-early cancellation.

WaitForLockersMultiple() is called three times in this case, but your
test case is waiting on a lock to be released for the old index which
REINDEX CONCURRENTLY would like to drop at the beginning of step 5, so
this should work reliably here.

+	TupleDescInitEntry(tupdesc, (AttrNumber) 3, "wait_even",
+					   TEXTOID, -1, 0);

Guess who is missing a 't' here.

pg_isolation_test_session_is_blocked() is not documented and it is
only used internally in the isolation test suite, so breaking its
compatibility should be fine in practice.  Now you are actually
changing it so that we get a more complex state of the blocked
session, so I think that we should use a different function name, and
a different function.  Like pg_isolation_test_session_state?
--
Michael

#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#40)
Re: Add an optional timeout clause to isolationtester step.

Michael Paquier <michael@paquier.xyz> writes:

On Tue, Mar 10, 2020 at 02:53:36PM +0100, Julien Rouhaud wrote:

So basically we could just change pg_isolation_test_session_is_blocked() to
also return the wait_event_type and wait_event, and adding something like

Hmm. I think that Tom has in mind the reasons behind 511540d here.

Yeah, that history suggests that we need to be very protective of the
performance of the wait-checking query, especially in CLOBBER_CACHE_ALWAYS
builds. That being the case, I'm hesitant to consider changing the test
function to return a tuple. That'll add quite a lot of overhead due to
the cache lookups involved, or so my gut says.

I'm also finding the proposed semantics (issue a cancel if wait state X
is reached) to be odd and special-purpose. I was envisioning something
more like "if wait state X is reached, consider the session to be blocked,
the same as if it had reached a heavyweight-lock wait". Then
isolationtester would move on to issue another step, which is where
I'd envision putting the cancel for that particular test usage.

So that idea leads to thinking that the wait-state specification is an
input to pg_isolation_test_session_is_blocked, not an output. We could
re-use Julien's ideas about the isolation spec syntax by making it be,
roughly,

step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]

and then those items would need to be passed as parameters of the prepared
query.
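To make the proposal concrete, a spec fragment using that syntax could look as follows (the step names, table, and clause placement are purely illustrative; none of this is committed grammar):

```
session "s1"
step "lock"    { BEGIN; SELECT data FROM reind_con_tab WHERE data = 'aa' FOR UPDATE; }
step "unlock"  { COMMIT; }

session "s2"
step "reindex" { REINDEX TABLE CONCURRENTLY reind_con_tab; } blocked if "Lock" "virtualxid"

session "s3"
step "cancel"  { SELECT pg_cancel_backend(pid) FROM pg_stat_activity
                 WHERE wait_event_type = 'Lock' AND wait_event = 'virtualxid'; }

permutation "lock" "reindex" "cancel" "unlock"
```

Once "reindex" is reported as blocked on the virtualxid wait, the tester would move on to run "cancel" from another session, which is exactly the sequence the REINDEX test needs.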

Or maybe we should use two different prepared queries depending on whether
there's a BLOCKED IF spec. We probably don't need lock-wait detection
if we're expecting a wait-state-based block, so maybe we should invent a
separate backend function "is this process waiting with this type of wait
state" and use that to check the state of a step that has this type of
annotation.

Just eyeing the proposed test case, I'm wondering whether this will
actually be sufficiently fine-grained. It seems like "REINDEX has
reached a wait on a virtual XID" is not really all that specific;
it could match on other situations, such as blocking on a concurrent
tuple update. Maybe it's okay given the restrictive context that
we don't expect anything to be happening that the isolation test
didn't ask for.

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of
long timeouts. If we could get rid of the timeouts in the
deadlock tests, that'd go a long way towards showing that this
idea is actually any good.

regards, tom lane

#42Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#41)
Re: Add an optional timeout clause to isolationtester step.

On 2020-Mar-11, Tom Lane wrote:

We could re-use Julien's ideas about the isolation spec syntax by
making it be, roughly,

step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]

and then those items would need to be passed as parameters of the prepared
query.

I think for test readability's sake, it'd be better to put the BLOCKED
IF clause ahead of the SQL, so you can write it in the same line and let
the SQL flow to the next one:

STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
{ select foo from pg_class where ... some more long clauses ... }

otherwise I think a step would require more lines to write.

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of
long timeouts. If we could get rid of the timeouts in the
deadlock tests, that'd go a long way towards showing that this
idea is actually any good.

+1. Those long timeouts are annoying enough that infrastructure to make
a run shorter in normal circumstances might be sufficient justification
for this patch ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#43Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#42)
Re: Add an optional timeout clause to isolationtester step.

On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:

On 2020-Mar-11, Tom Lane wrote:

We could re-use Julien's ideas about the isolation spec syntax by
making it be, roughly,

step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]

and then those items would need to be passed as parameters of the prepared
query.

I think for test readability's sake, it'd be better to put the BLOCKED
IF clause ahead of the SQL, so you can write it in the same line and let
the SQL flow to the next one:

STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
{ select foo from pg_class where ... some more long clauses ... }

otherwise I think a step would require more lines to write.

I prefer this version.

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of
long timeouts. If we could get rid of the timeouts in the
deadlock tests, that'd go a long way towards showing that this
idea is actually any good.

+1. Those long timeouts are annoying enough that infrastructure to make
a run shorter in normal circumstances might be sufficient justification
for this patch ...

+1.  A patch does not seem to be that complicated.  Now isn't it too
late for v13?
--
Michael
#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#43)
Re: Add an optional timeout clause to isolationtester step.

Michael Paquier <michael@paquier.xyz> writes:

+1. A patch does not seem to be that complicated. Now isn't it too
late for v13?

I think we've generally given new tests more slack than new features so
far as schedule goes. If the patch ends up being complicated/invasive,
I might vote to hold it for v14, but let's see it first.

regards, tom lane

#45Julien Rouhaud
rjuju123@gmail.com
In reply to: Alvaro Herrera (#42)
Re: Add an optional timeout clause to isolationtester step.

On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:

On 2020-Mar-11, Tom Lane wrote:

We could re-use Julien's ideas about the isolation spec syntax by
making it be, roughly,

step "<name>" { <SQL> } [ blocked if "<wait_event_type>" "<wait_event>" ]

and then those items would need to be passed as parameters of the prepared
query.

I think for test readability's sake, it'd be better to put the BLOCKED
IF clause ahead of the SQL, so you can write it in the same line and let
the SQL flow to the next one:

STEP "long_select" BLOCKED IF "lwlock" "ClogControlLock"
{ select foo from pg_class where ... some more long clauses ... }

otherwise I think a step would require more lines to write.

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of
long timeouts. If we could get rid of the timeouts in the
deadlock tests, that'd go a long way towards showing that this
idea is actually any good.

+1. Those long timeouts are annoying enough that infrastructure to make
a run shorter in normal circumstances might be sufficient justification
for this patch ...

I'm not familiar with those tests so I'm probably missing something, but it looks
like all isolation tests that set up a timeout are doing so to test server-side
features (deadlock detection, statement and lock timeout). I'm not sure how
adding a client-side facility to detect locks earlier is going to help reduce
the server-side timeouts?

For the REINDEX CONCURRENTLY failure test, the problem that needs to be solved
isn't detecting that the command is blocked, as it's already getting blocked on
a heavyweight lock, but being able to reliably cancel a specific query as early
as possible, which AFAICS isn't possible with the current isolationtester:

- either we reliably cancel the query using a statement timeout, but we'll make
it slow for everyone
- or we send a blind pg_cancel_backend(), hoping that we don't catch
anything else (and also make it slower than required to make sure that it's
not canceled too early)

So we would actually only need something like this to make it work:

step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#45)
Re: Add an optional timeout clause to isolationtester step.

Julien Rouhaud <rjuju123@gmail.com> writes:

On Wed, Mar 11, 2020 at 05:52:54PM -0300, Alvaro Herrera wrote:

On 2020-Mar-11, Tom Lane wrote:

I'd like to see an attempt to rewrite some of the existing
timeout-dependent test cases to use this facility instead of
long timeouts.

+1. Those long timeouts are annoying enough that infrastructure to make
a run shorter in normal circumstances might be sufficient justification
for this patch ...

I'm not familiar with those tests so I'm probably missing something, but it looks
like all isolation tests that set up a timeout are doing so to test server-side
features (deadlock detection, statement and lock timeout). I'm not sure how
adding a client-side facility to detect locks earlier is going to help reduce
the server-side timeouts?

The point is that those timeouts have to be set long enough for even a
very slow machine to reach a desired state before the timeout happens;
on faster machines the test is just uselessly sleeping for a long time,
because of the fixed timeout. My thought was that maybe the tests could
be recast as "watch for session to reach $expected_state and then do
the next thing", allowing them to be automatically adaptive to the
machine's speed. This might require some rather subtle test redesign
and/or addition of more infrastructure (to allow recognition of the
desired state and/or taking an appropriate next action). I'm prepared
to believe that not much can be done about timeouts.spec in particular,
but it seems to me that the long delays in the deadlock tests are not
inherent in what we need to test.

For the REINDEX CONCURRENTLY failure test, the problem that needs to be solved
isn't detecting that the command is blocked as it's already getting blocked on
a heavyweight lock, but being able to reliably cancel a specific query as early
as possible, which AFAICS isn't possible with current isolation tester:

Right, it's the same thing of needing to wait till the backend has reached
a particular state before you do the next thing.

So we would actually only need something like this to make it work:
step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }

I continue to resist the idea of hard-wiring this feature to query cancel
as the action-to-take. That will more or less guarantee that it's not
good for anything but this one test case. I think that the feature
should have the behavior of "treat this step as blocked once it's reached
state X", and then you make the next step in the permutation be one that
issues a query cancel. (Possibly, using pg_stat_activity and
pg_cancel_backend for that will be painful enough that we'd want to
invent separate script syntax that says "send a cancel to session X".
But that's a separate discussion.)
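For reference, the pg_stat_activity/pg_cancel_backend dance would be roughly the following SQL, assuming exactly one other backend is sitting on that wait state (which is precisely the fragile part):

```sql
-- Cancel whichever backend is currently stuck on the given wait state.
-- Nothing here guarantees that only one backend matches, nor that it is
-- the session the test actually cares about.
SELECT pg_cancel_backend(pid)
  FROM pg_stat_activity
 WHERE wait_event_type = 'Lock'
   AND wait_event = 'virtualxid'
   AND pid <> pg_backend_pid();
```

A dedicated "send a cancel to session X" script syntax would let the tester target the session's known PID directly instead of guessing from the wait state.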

regards, tom lane

#47Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#46)
Re: Add an optional timeout clause to isolationtester step.

On Fri, Mar 13, 2020 at 10:12:20AM -0400, Tom Lane wrote:

Julien Rouhaud <rjuju123@gmail.com> writes:

I'm not familiar with those tests so I'm probably missing something, but it looks
like all isolation tests that set up a timeout are doing so to test server-side
features (deadlock detection, statement and lock timeout). I'm not sure how
adding a client-side facility to detect locks earlier is going to help reduce
the server-side timeouts?

The point is that those timeouts have to be set long enough for even a
very slow machine to reach a desired state before the timeout happens;
on faster machines the test is just uselessly sleeping for a long time,
because of the fixed timeout. My thought was that maybe the tests could
be recast as "watch for session to reach $expected_state and then do
the next thing", allowing them to be automatically adaptive to the
machine's speed. This might require some rather subtle test redesign
and/or addition of more infrastructure (to allow recognition of the
desired state and/or taking an appropriate next action). I'm prepared
to believe that not much can be done about timeouts.spec in particular,
but it seems to me that the long delays in the deadlock tests are not
inherent in what we need to test.

Ah, I see. I'll try to see if that could help the deadlock tests, but such a
feature would certainly allow us to get rid of the two pg_sleep(5) calls in
tuplelock-update.

It seems that for all the possibly interesting cases, what we want to wait on
is a heavyweight lock, which is already what isolationtester detects. Maybe
we could simply implement something like

step "<name>" [ WAIT UNTIL BLOCKED ] { <SQL> }

without any change to the blocking detection function?

For the REINDEX CONCURRENTLY failure test, the problem that needs to be solved
isn't detecting that the command is blocked as it's already getting blocked on
a heavyweight lock, but being able to reliably cancel a specific query as early
as possible, which AFAICS isn't possible with current isolation tester:

Right, it's the same thing of needing to wait till the backend has reached
a particular state before you do the next thing.

So we would actually only need something like this to make it work:
step "<name>" [ CANCEL IF BLOCKED ] { <SQL> }

I continue to resist the idea of hard-wiring this feature to query cancel
as the action-to-take. That will more or less guarantee that it's not
good for anything but this one test case. I think that the feature
should have the behavior of "treat this step as blocked once it's reached
state X", and then you make the next step in the permutation be one that
issues a query cancel. (Possibly, using pg_stat_activity and
pg_cancel_backend for that will be painful enough that we'd want to
invent separate script syntax that says "send a cancel to session X".
But that's a separate discussion.)

I agree. A new step option to kill a session rather than executing SQL would
go perfectly with the above new active-wait-for-blocking-state feature.

#48Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#47)
Re: Add an optional timeout clause to isolationtester step.

Julien Rouhaud <rjuju123@gmail.com> writes:

It seems that for all the possibly interesting cases, what we want to wait on
is a heavyweight lock, which is already what isolationtester detects. Maybe
we could simply implement something like

step "<name>" [ WAIT UNTIL BLOCKED ] { <SQL> }

without any change to the blocking detection function?

Um, isn't that the existing built-in behavior?

I could actually imagine some uses for the reverse option, *don't* wait
for it to become blocked but just immediately continue with issuing
the next step.

regards, tom lane