CREATE INDEX CONCURRENTLY does not index prepared xact's data

Started by Andrey Borodinover 5 years ago147 messagesbugs
Jump to latest
#1Andrey Borodin
amborodin@acm.org

Hi hackers!

$subj.

Steps to reproduce:
create extension if not exists amcheck;
create table if not exists t1(i int);
begin;
insert into t1 values(1);
prepare transaction 'x';
create index concurrently i1 on t1(i);
commit prepared 'x';
select bt_index_check('i1', true);

I observe:
NOTICE: heap tuple (1,8) from table "t1" lacks matching index tuple within index "i1"
I expect: awaiting 'x' commit before index is created, correct index after.

This happens because WaitForLockersMultiple() does not take prepared xacts into account. Meanwhile CREATE INDEX CONCURRENTLY expects that locks are dropped only when transaction commit is visible.

This issue affects pg_repack and similar machinery based on CIC.

PFA draft of a fix.

Best regards, Andrey Borodin.

Attachments:

v1-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v1-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+28-16
#2Victor Yegorov
vyegorov@gmail.com
In reply to: Andrey Borodin (#1)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

сб, 19 дек. 2020 г. в 18:13, Andrey Borodin <x4mmm@yandex-team.ru>:

Steps to reproduce:
create extension if not exists amcheck;
create table if not exists t1(i int);
begin;
insert into t1 values(1);
prepare transaction 'x';
create index concurrently i1 on t1(i);
commit prepared 'x';
select bt_index_check('i1', true);

I observe:
NOTICE: heap tuple (1,8) from table "t1" lacks matching index tuple
within index "i1"
I expect: awaiting 'x' commit before index is created, correct index after.

CREATE INDEX CONCURRENTLY isn't supposed to be run inside a transaction?..

--
Victor Yegorov

#3Andrey Borodin
amborodin@acm.org
In reply to: Victor Yegorov (#2)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

19 дек. 2020 г., в 22:22, Victor Yegorov <vyegorov@gmail.com> написал(а):

CREATE INDEX CONCURRENTLY isn't supposed to be run inside a transaction?..

CREATE INDEX CONCURRENTLY cannot run inside a transaction block.

Best regards, Andrey Borodin.

#4Victor Yegorov
vyegorov@gmail.com
In reply to: Andrey Borodin (#1)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

сб, 19 дек. 2020 г. в 18:13, Andrey Borodin <x4mmm@yandex-team.ru>:

I observe:
NOTICE: heap tuple (1,8) from table "t1" lacks matching index tuple
within index "i1"
I expect: awaiting 'x' commit before index is created, correct index after.

I agree, that behaviour is unexpected. But getting a notice that requires me
to re-create the index some time later is not better (from DBA perspective).

Maybe it'd be better to wait on prepared xacts like on other open ordinary
transactions?

--
Victor Yegorov

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#1)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

Andrey Borodin <x4mmm@yandex-team.ru> writes:

This happens because WaitForLockersMultiple() does not take prepared
xacts into account.

Ugh, clearly an oversight.

Meanwhile CREATE INDEX CONCURRENTLY expects that locks are dropped only
when transaction commit is visible.

Don't follow your point here --- I'm pretty sure that prepared xacts
continue to hold their locks.

PFA draft of a fix.

Haven't you completely broken VirtualXactLock()? Certainly, whether the
target is a normal or prepared transaction shouldn't alter the meaning
of the "wait" flag.

In general, I wonder whether WaitForLockersMultiple and GetLockConflicts
need to gain an additional parameter indicating whether to consider
prepared xacts. It's not clear to me that their current behavior is wrong
for all possible uses.

regards, tom lane

#6Andrey Borodin
amborodin@acm.org
In reply to: Victor Yegorov (#4)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

19 дек. 2020 г., в 22:48, Victor Yegorov <vyegorov@gmail.com> написал(а):

сб, 19 дек. 2020 г. в 18:13, Andrey Borodin <x4mmm@yandex-team.ru>:
I observe:
NOTICE: heap tuple (1,8) from table "t1" lacks matching index tuple within index "i1"
I expect: awaiting 'x' commit before index is created, correct index after.

I agree, that behaviour is unexpected. But getting a notice that requires me
to re-create the index some time later is not better (from DBA perspective).

Maybe it'd be better to wait on prepared xacts like on other open ordinary transactions?

This is not a real NOTICE. I just used a bit altered amcheck to diagnose the problem. It's an incorrect index. It lacks some tuples. It will not find existing data, failing to provide "read committed" consistency guaranties.
Fix waits for prepared xacts just like any other transaction.

19 дек. 2020 г., в 22:57, Tom Lane <tgl@sss.pgh.pa.us> написал(а):

Meanwhile CREATE INDEX CONCURRENTLY expects that locks are dropped only
when transaction commit is visible.

Don't follow your point here --- I'm pretty sure that prepared xacts
continue to hold their locks.

Uhmm, yes, locks are there. Relation is locked with RowExclusiveLock, but this lock is ignored by WaitForLockers(heaplocktag, ShareLock, true). Locking relation with ShareLock works as expected.

PFA draft of a fix.

Haven't you completely broken VirtualXactLock()? Certainly, whether the
target is a normal or prepared transaction shouldn't alter the meaning
of the "wait" flag.

You are right, the patch has a bug here.

In general, I wonder whether WaitForLockersMultiple and GetLockConflicts
need to gain an additional parameter indicating whether to consider
prepared xacts. It's not clear to me that their current behavior is wrong
for all possible uses.

I don't see usages besides indexing stuff. But maybe it worth to analyse each case...

BTW do we need a test for this? Will isolation test be good at checking this?

Thanks!

Best regards, Andrey Borodin.

#7Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#6)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

19 дек. 2020 г., в 23:25, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):

BTW do we need a test for this? Will isolation test be good at checking this?

PFA patch set with isolation test for the $subj and fix for VirtualXactLock() bug.

I think I'll register the thread on January CF.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v2-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v2-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+30-16
v2-0002-Add-test-for-CIC-with-prepared-xacts.patchapplication/octet-stream; name=v2-0002-Add-test-for-CIC-with-prepared-xacts.patch; x-unix-mode=0644Download+54-3
#8Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#5)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

On Sat, Dec 19, 2020 at 12:57:41PM -0500, Tom Lane wrote:

Andrey Borodin <x4mmm@yandex-team.ru> writes:

This happens because WaitForLockersMultiple() does not take prepared
xacts into account.

Ugh, clearly an oversight.

This looks to be the case since 295e639 where virtual XIDs have been
introduced. So this is an old bug.

Don't follow your point here --- I'm pretty sure that prepared xacts
continue to hold their locks.

Yes, that's what I recall as well.

Haven't you completely broken VirtualXactLock()? Certainly, whether the
target is a normal or prepared transaction shouldn't alter the meaning
of the "wait" flag.

Yep.

In general, I wonder whether WaitForLockersMultiple and GetLockConflicts
need to gain an additional parameter indicating whether to consider
prepared xacts. It's not clear to me that their current behavior is wrong
for all possible uses.

WaitForLockers is used only by REINDEX and CREATE/DROP CONCURRENTLY,
where it seems to me we need to care about all the cases related to
concurrent build, validation and index drop. The other caller of
GetLockConflicts() is for conflict resolution in standbys where it is
fine to ignore 2PC transactions as these cannot be cancelled. So I
agree that we are going to need more control with a new option
argument to be able to control if 2PC transactions are ignored or
not.

Hmm. The approach taken by the patch looks to be back-patchable.
Based on the lack of complaints on the matter, we could consider
instead putting an error in WaitForLockersMultiple() if there is at
least one numPrepXact which would at least avoid inconsistent data.
But I don't think what's proposed here is bad either.

VirtualTransactionIdIsValidOrPreparedXact() is confusing IMO, knowing
that VirtualTransactionIdIsPreparedXact() combined with
LocalTransactionIdIsValid() would be enough to do the job.

-       Assert(VirtualTransactionIdIsValid(vxid));
+       Assert(VirtualTransactionIdIsValidOrPreparedXact(vxid));
+
+       if (VirtualTransactionIdIsPreparedXact(vxid))
[...]
#define VirtualTransactionIdIsPreparedXact(vxid) \
    ((vxid).backendId == InvalidBackendId)
This would allow the case where backendId and localTransactionId are
both invalid.  So it would be better to also check in
VirtualTransactionIdIsPreparedXact() that the XID is not invalid, no?
--
Michael
#9Andrey Borodin
amborodin@acm.org
In reply to: Michael Paquier (#8)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

21 дек. 2020 г., в 10:40, Michael Paquier <michael@paquier.xyz> написал(а):

In general, I wonder whether WaitForLockersMultiple and GetLockConflicts
need to gain an additional parameter indicating whether to consider
prepared xacts. It's not clear to me that their current behavior is wrong
for all possible uses.

WaitForLockers is used only by REINDEX and CREATE/DROP CONCURRENTLY,
where it seems to me we need to care about all the cases related to
concurrent build, validation and index drop. The other caller of
GetLockConflicts() is for conflict resolution in standbys where it is
fine to ignore 2PC transactions as these cannot be cancelled.

I don't think that fact that we cannot cancel transaction is sufficient here to ignore prepared transaction. I think there should not exist any prepared transaction that we need to cancel in standby conflict resolution. And if it exists - it's a sign of corruption, we could emit warning or something like that.
But I'm really not an expert here, just a common sense that prepared transaction is just like regular transaction that survives crash. If we wait for any transaction - probably we should wait for prepared too. I'm not insisting on anything though.

So I
agree that we are going to need more control with a new option
argument to be able to control if 2PC transactions are ignored or
not.

Hmm. The approach taken by the patch looks to be back-patchable.
Based on the lack of complaints on the matter, we could consider
instead putting an error in WaitForLockersMultiple() if there is at
least one numPrepXact which would at least avoid inconsistent data.
But I don't think what's proposed here is bad either.

VirtualTransactionIdIsValidOrPreparedXact() is confusing IMO, knowing
that VirtualTransactionIdIsPreparedXact() combined with
LocalTransactionIdIsValid() would be enough to do the job.

-       Assert(VirtualTransactionIdIsValid(vxid));
+       Assert(VirtualTransactionIdIsValidOrPreparedXact(vxid));
+
+       if (VirtualTransactionIdIsPreparedXact(vxid))
[...]
#define VirtualTransactionIdIsPreparedXact(vxid) \
((vxid).backendId == InvalidBackendId)
This would allow the case where backendId and localTransactionId are
both invalid.  So it would be better to also check in
VirtualTransactionIdIsPreparedXact() that the XID is not invalid, no?

Seems valid. Removed VirtualTransactionIdIsValidOrPreparedXact() from patch.

Thanks!

Best regards, Andrey Borodin.

Attachments:

v3-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v3-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+32-16
v3-0002-Add-test-for-CIC-with-prepared-xacts.patchapplication/octet-stream; name=v3-0002-Add-test-for-CIC-with-prepared-xacts.patch; x-unix-mode=0644Download+54-3
#10Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#9)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

21 дек. 2020 г., в 12:24, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):

Seems valid. Removed VirtualTransactionIdIsValidOrPreparedXact() from patch.

Sorry for the noise, removal was not complete.

Best regards, Andrey Borodin.

Attachments:

v4-0002-Add-test-for-CIC-with-prepared-xacts.patchapplication/octet-stream; name=v4-0002-Add-test-for-CIC-with-prepared-xacts.patch; x-unix-mode=0644Download+54-3
v4-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v4-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+30-16
#11Andrey Borodin
amborodin@acm.org
In reply to: Michael Paquier (#8)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

21 дек. 2020 г., в 10:40, Michael Paquier <michael@paquier.xyz> написал(а):

Hmm. The approach taken by the patch looks to be back-patchable.

I was checking that patch\test works in supported branches and everything seems to be fine down to REL_10_STABLE.

But my machines (Ubuntu18 and MacOS) report several fails in isolation tests on REL9_6_20\REL9_6_STABLE. Particularly eval-plan-qual-trigger, drop-index-concurrently-1, and async-notify. I do not observe problems with regular isolation tests though.

Do I understand correctly that check-world tests on buildfarm 'make check-prepared-txns' and the problem is somewhere in my machines? Or something is actually broken\outdated?

Thanks!

Best regards, Andrey Borodin.

#12Michael Paquier
michael@paquier.xyz
In reply to: Andrey Borodin (#11)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

On Wed, Dec 23, 2020 at 12:23:28PM +0500, Andrey Borodin wrote:

Do I understand correctly that check-world tests on buildfarm 'make
check-prepared-txns' and the problem is somewhere in my machines? Or
something is actually broken\outdated?

The buildfarm code has no trace of check-prepared-txns. And, FWIW, I
see no failures at the top of REL9_6_STABLE. Do you mean that this
happens only with your patch? Or do you mean that you see failures
using the stable branch of upstream? I have not tested the former,
but the latter works fine on my end.
--
Michael

#13Andrey Borodin
amborodin@acm.org
In reply to: Michael Paquier (#12)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

23 дек. 2020 г., в 12:52, Michael Paquier <michael@paquier.xyz> написал(а):

On Wed, Dec 23, 2020 at 12:23:28PM +0500, Andrey Borodin wrote:

Do I understand correctly that check-world tests on buildfarm 'make
check-prepared-txns' and the problem is somewhere in my machines? Or
something is actually broken\outdated?

FWIW, I
see no failures at the top of REL9_6_STABLE.

Thanks, Michael! The problem was indeed within my machines. maintainer-cleanup is not enough for make check-prepared-txns. Fresh real installation is necessary.

I've checked that test works down to REL9_5_STABLE with patch.

Thanks!

Best regards, Andrey Borodin.

#14Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#10)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

On Mon, Dec 21, 2020 at 12:24:55PM +0500, Andrey Borodin wrote:

21 дек. 2020 г., в 10:40, Michael Paquier <michael@paquier.xyz> написал(а):

In general, I wonder whether WaitForLockersMultiple and GetLockConflicts
need to gain an additional parameter indicating whether to consider
prepared xacts. It's not clear to me that their current behavior is wrong
for all possible uses.

WaitForLockers is used only by REINDEX and CREATE/DROP CONCURRENTLY,
where it seems to me we need to care about all the cases related to
concurrent build, validation and index drop. The other caller of
GetLockConflicts() is for conflict resolution in standbys where it is
fine to ignore 2PC transactions as these cannot be cancelled.

I don't think that fact that we cannot cancel transaction is sufficient here to ignore prepared transaction. I think there should not exist any prepared transaction that we need to cancel in standby conflict resolution. And if it exists - it's a sign of corruption, we could emit warning or something like that.

Agreed. Based on src/backend/storage/lmgr/README section "Locking during Hot
Standby" and RecoverPreparedTransactions(), prepared-transaction PGPROC
entries get locks at end of recovery, not earlier. The conflict resolution
callers won't see prepared-transaction locks, and they could assert that,
raise an error, or just assume that implicitly.

But I'm really not an expert here, just a common sense that prepared transaction is just like regular transaction that survives crash. If we wait for any transaction - probably we should wait for prepared too. I'm not insisting on anything though.

In the startup process of a standby, waiting for a primary-side transaction
(regular or prepared) would succeed instantly, rendering the wait meaningless.
I'm not too worried about it; several parts of the system would change if
these standby-side lock invariants ceased to hold.

On Mon, Dec 21, 2020 at 01:19:59PM +0500, Andrey Borodin wrote:

--- /dev/null
+++ b/src/test/isolation/expected/prepared-transactions-cic.out
@@ -0,0 +1,18 @@
+Parsed test spec with 2 sessions
+
+starting permutation: w1 p1 cic2 c1 r2
+step w1: BEGIN; INSERT INTO cic_test VALUES (1);
+step p1: PREPARE TRANSACTION 's1';
+step cic2: 
+    CREATE INDEX CONCURRENTLY on cic_test(a);
+
+ERROR:  canceling statement due to lock timeout

I wondered if a slow server could ever print "<waiting ...>" in here. It
can't; this is fine. pg_isolation_test_session_is_blocked() never reports the
prepared transaction, because that transaction's locks have no pid associated.

--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2931,15 +2929,17 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
/*
* Allocate memory to store results, and fill with InvalidVXID.  We only
-	 * need enough space for MaxBackends + a terminator, since prepared xacts
-	 * don't count. InHotStandby allocate once in TopMemoryContext.
+	 * need enough space for MaxBackends + max_prepared_xacts + a
+	 * terminator. InHotStandby allocate once in TopMemoryContext.
*/
if (InHotStandby)
{
if (vxids == NULL)
vxids = (VirtualTransactionId *)
MemoryContextAlloc(TopMemoryContext,
-								   sizeof(VirtualTransactionId) * (MaxBackends + 1));
+								   sizeof(VirtualTransactionId) * (MaxBackends 
+								   + max_prepared_xacts
+								   + 1));

PostgreSQL typically puts the operator before the newline. Also, please note
the whitespace error that "git diff --check origin/master" reports.

}
else
vxids = (VirtualTransactionId *)

This is updating only the InHotStandby branch. Both branches need the change.

@@ -4461,9 +4462,21 @@ bool
VirtualXactLock(VirtualTransactionId vxid, bool wait)
{
LOCKTAG		tag;
-	PGPROC	   *proc;
+	PGPROC	   *proc = NULL;
-	Assert(VirtualTransactionIdIsValid(vxid));
+	Assert(VirtualTransactionIdIsValid(vxid) ||
+			VirtualTransactionIdIsPreparedXact(vxid));
+
+	if (VirtualTransactionIdIsPreparedXact(vxid))
+	{
+		LockAcquireResult lar;
+		/* If it's prepared xact - just wait for it */
+		SET_LOCKTAG_TRANSACTION(tag, vxid.localTransactionId);
+		lar = LockAcquire(&tag, ShareLock, false, !wait);
+		if (lar == LOCKACQUIRE_OK)

This should instead test "!= LOCKACQUIRE_NOT_AVAIL", lest
LOCKACQUIRE_ALREADY_HELD happen. (It perhaps can't happen, but skipping the
LockRelease() would be wrong if it did.)

+			LockRelease(&tag, ShareLock, false);
+		return lar != LOCKACQUIRE_NOT_AVAIL;
+	}

SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);

diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1c3e9c1999..cedb9d6d2a 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -70,6 +70,8 @@ typedef struct
#define VirtualTransactionIdIsValid(vxid) \
(((vxid).backendId != InvalidBackendId) && \
LocalTransactionIdIsValid((vxid).localTransactionId))
+#define VirtualTransactionIdIsPreparedXact(vxid) \
+	((vxid).backendId == InvalidBackendId)

Your patch is introducing VirtualTransactionId values that represent prepared
xacts, and it has VirtualTransactionIdIsValid() return false for those values.
Let's make VirtualTransactionIdIsValid() return true for them, since they're
as meaningful as any other value. The GetLockConflicts() header comment says
"The result array is palloc'd and is terminated with an invalid VXID." Patch
v4 falsifies that comment. The array can contain many of these new "invalid"
VXIDs, and an all-zeroes VXID terminates the array. Rather than change the
comment, let's change VirtualTransactionIdIsValid() to render the comment true
again. Most (perhaps all) VirtualTransactionIdIsValid() callers won't need to
distinguish the prepared-transaction case.

An alternative to redefining VXID this way would be to have some new type,
each instance of which holds either a valid VXID or a valid
prepared-transaction XID. That alternative feels inferior to me, though.
What do you think?

Thanks,
nm

#15Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#14)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

Thanks for looking into this!

17 янв. 2021 г., в 12:24, Noah Misch <noah@leadboat.com> написал(а):

--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2931,15 +2929,17 @@ GetLockConflicts(const LOCKTAG *locktag, LOCKMODE lockmode, int *countp)
/*
* Allocate memory to store results, and fill with InvalidVXID.  We only
-	 * need enough space for MaxBackends + a terminator, since prepared xacts
-	 * don't count. InHotStandby allocate once in TopMemoryContext.
+	 * need enough space for MaxBackends + max_prepared_xacts + a
+	 * terminator. InHotStandby allocate once in TopMemoryContext.
*/
if (InHotStandby)
{
if (vxids == NULL)
vxids = (VirtualTransactionId *)
MemoryContextAlloc(TopMemoryContext,
-								   sizeof(VirtualTransactionId) * (MaxBackends + 1));
+								   sizeof(VirtualTransactionId) * (MaxBackends 
+								   + max_prepared_xacts
+								   + 1));

PostgreSQL typically puts the operator before the newline. Also, please note
the whitespace error that "git diff --check origin/master" reports.

Fixed.

}
else
vxids = (VirtualTransactionId *)

This is updating only the InHotStandby branch. Both branches need the change.

Fixed.

@@ -4461,9 +4462,21 @@ bool
VirtualXactLock(VirtualTransactionId vxid, bool wait)
{
LOCKTAG		tag;
-	PGPROC	   *proc;
+	PGPROC	   *proc = NULL;
-	Assert(VirtualTransactionIdIsValid(vxid));
+	Assert(VirtualTransactionIdIsValid(vxid) ||
+			VirtualTransactionIdIsPreparedXact(vxid));
+
+	if (VirtualTransactionIdIsPreparedXact(vxid))
+	{
+		LockAcquireResult lar;
+		/* If it's prepared xact - just wait for it */
+		SET_LOCKTAG_TRANSACTION(tag, vxid.localTransactionId);
+		lar = LockAcquire(&tag, ShareLock, false, !wait);
+		if (lar == LOCKACQUIRE_OK)

This should instead test "!= LOCKACQUIRE_NOT_AVAIL", lest
LOCKACQUIRE_ALREADY_HELD happen. (It perhaps can't happen, but skipping the
LockRelease() would be wrong if it did.)

I think that code that successfully acquired lock should release it. Other way we risk to release someone's else lock held for a reason. We only acquire lock to release it instantly anyway.

+			LockRelease(&tag, ShareLock, false);
+		return lar != LOCKACQUIRE_NOT_AVAIL;
+	}

SET_LOCKTAG_VIRTUALTRANSACTION(tag, vxid);

diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1c3e9c1999..cedb9d6d2a 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -70,6 +70,8 @@ typedef struct
#define VirtualTransactionIdIsValid(vxid) \
(((vxid).backendId != InvalidBackendId) && \
LocalTransactionIdIsValid((vxid).localTransactionId))
+#define VirtualTransactionIdIsPreparedXact(vxid) \
+	((vxid).backendId == InvalidBackendId)

Your patch is introducing VirtualTransactionId values that represent prepared
xacts, and it has VirtualTransactionIdIsValid() return false for those values.
Let's make VirtualTransactionIdIsValid() return true for them, since they're
as meaningful as any other value. The GetLockConflicts() header comment says
"The result array is palloc'd and is terminated with an invalid VXID." Patch
v4 falsifies that comment. The array can contain many of these new "invalid"
VXIDs, and an all-zeroes VXID terminates the array. Rather than change the
comment, let's change VirtualTransactionIdIsValid() to render the comment true
again. Most (perhaps all) VirtualTransactionIdIsValid() callers won't need to
distinguish the prepared-transaction case.

Makes sense, fixed. I was afraid that there's something I'm not aware about. I've iterated over VirtualTransactionIdIsValid() calls and did not find suspicious cases.

An alternative to redefining VXID this way would be to have some new type,
each instance of which holds either a valid VXID or a valid
prepared-transaction XID. That alternative feels inferior to me, though.
What do you think?

I think we should not call valid vxids "invalid".

By the way maybe rename "check-prepared-txns" to "check-prepared-xacts" for consistency?

Thanks!

Best regards, Andrey Borodin.

Attachments:

v5-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v5-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+32-19
v5-0002-Add-test-for-CIC-with-prepared-xacts.patchapplication/octet-stream; name=v5-0002-Add-test-for-CIC-with-prepared-xacts.patch; x-unix-mode=0644Download+54-3
#16Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#15)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

22 янв. 2021 г., в 10:44, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):

<v5-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch><v5-0002-Add-test-for-CIC-with-prepared-xacts.patch>

Uh, vscode did not save files and I've send incorrect version. Disregard v5.
Sorry for the noise.

Best regards, Andrey Borodin.

Attachments:

v6-0002-Add-test-for-CIC-with-prepared-xacts.patchapplication/octet-stream; name=v6-0002-Add-test-for-CIC-with-prepared-xacts.patch; x-unix-mode=0644Download+54-3
v6-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patchapplication/octet-stream; name=v6-0001-Wait-for-prepared-xacts-in-CREATE-INDEX-CONCURREN.patch; x-unix-mode=0644Download+25-15
#17Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#15)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

On Fri, Jan 22, 2021 at 10:44:35AM +0500, Andrey Borodin wrote:

17 янв. 2021 г., в 12:24, Noah Misch <noah@leadboat.com> написал(а):

@@ -4461,9 +4462,21 @@ bool
VirtualXactLock(VirtualTransactionId vxid, bool wait)
{
LOCKTAG		tag;
-	PGPROC	   *proc;
+	PGPROC	   *proc = NULL;
-	Assert(VirtualTransactionIdIsValid(vxid));
+	Assert(VirtualTransactionIdIsValid(vxid) ||
+			VirtualTransactionIdIsPreparedXact(vxid));
+
+	if (VirtualTransactionIdIsPreparedXact(vxid))
+	{
+		LockAcquireResult lar;
+		/* If it's prepared xact - just wait for it */
+		SET_LOCKTAG_TRANSACTION(tag, vxid.localTransactionId);
+		lar = LockAcquire(&tag, ShareLock, false, !wait);
+		if (lar == LOCKACQUIRE_OK)

This should instead test "!= LOCKACQUIRE_NOT_AVAIL", lest
LOCKACQUIRE_ALREADY_HELD happen. (It perhaps can't happen, but skipping the
LockRelease() would be wrong if it did.)

I think that code that successfully acquired lock should release it. Other way we risk to release someone's else lock held for a reason. We only acquire lock to release it instantly anyway.

LockAcquire() increments the reference count before returning
LOCKACQUIRE_ALREADY_HELD, so it is an acquire. If this caller doesn't
LockRelease(), the lock will persist until end of transaction.

I changed that, updated comments, and fixed pgindent damage. I plan to push
the attached version.

The array can contain many of these new "invalid"
VXIDs, and an all-zeroes VXID terminates the array. Rather than change the

Correction: the terminator contains (InvalidBackendId,0).

By the way maybe rename "check-prepared-txns" to "check-prepared-xacts" for consistency?

The "xact" term is about three times as frequent "txn". I favor "xact" in new
usage. It's not dominant enough to change old usage unless done pervasively.

Attachments:

prepared-transactions-cic-v7nm.patchtext/plain; charset=us-asciiDownload+98-37
#18Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#17)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

24 янв. 2021 г., в 07:27, Noah Misch <noah@leadboat.com> написал(а):

I changed that, updated comments, and fixed pgindent damage. I plan to push
the attached version.

I see that patch was pushed. I'll flip status of CF entry. Many thanks!

Best regards, Andrey Borodin.

#19Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#18)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

30 янв. 2021 г., в 21:06, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):

24 янв. 2021 г., в 07:27, Noah Misch <noah@leadboat.com> написал(а):

I changed that, updated comments, and fixed pgindent damage. I plan to push
the attached version.

I see that patch was pushed. I'll flip status of CF entry. Many thanks!

FWIW I have 2 new reported cases on 12.6. I've double-checked that at the moment of corruption installation run version with the patch.
To the date I could not reproduce the problem myself, but I'll continue working on this.

Thanks!

Best regards, Andrey Borodin.

#20Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#19)
Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

1 мая 2021 г., в 17:42, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):

FWIW I have 2 new reported cases on 12.6.

Sorry to say that, but $subj persists. Here's a simple reproduction.
To get corrupted index you need 3 psql sessions A, B and C. By default command is executed in A.

create extension amcheck ;
create table t1(i int);
create index on t1(i);

begin;
insert into t1 values(0);

-- session C: reindex table concurrently t1;

prepare transaction 'a';
begin;
insert into t1 values(0);
-- session B: commit prepared 'a';
prepare transaction 'b';
begin;
insert into t1 values(0);
-- session B: commit prepared 'b';
prepare transaction 'c';

begin;
insert into t1 values(0);
-- session B: commit prepared 'c';
prepare transaction 'd';
commit prepared 'd';

-- session C: postgres=# select bt_index_check('i1',true);
ERROR: heap tuple (0,2) from table "t1" lacks matching index tuple within index "i1"
HINT: Retrying verification using the function bt_index_parent_check() might provide a more specific error.

The problem is WaitForLockersMultiple() gathers running vxids and 2pc xids. Then it waits, but if vxid is converted to 2pc it fails to wait.
I could not compose isolation test for this, because we need to do "prepare transaction 'a';" only when "reindex table concurrently t1;" is already working for some time.

To fix it we can return locking xids along with vxids from GetLockConflicts() like in attached diff. But this approach seems ugly.

Best regards, Andrey Borodin.

Attachments:

return_xids_with_vxids.diffapplication/octet-stream; name=return_xids_with_vxids.diff; x-unix-mode=0644Download+17-3
#21Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#20)
#22Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#21)
#23Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#22)
#24Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#23)
#25Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#24)
#26Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#25)
#27Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#26)
#28Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#26)
#29Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#28)
#30Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#29)
#31Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#30)
#32Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#31)
#33Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#32)
In reply to: Noah Misch (#33)
#35Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#33)
#36Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#35)
#37Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#36)
#38Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#37)
#39Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#38)
#40Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#39)
#41Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#40)
In reply to: Andrey Borodin (#41)
#43Andrey Borodin
amborodin@acm.org
In reply to: Peter Geoghegan (#42)
#44Andrey Borodin
amborodin@acm.org
In reply to: Peter Geoghegan (#42)
#45Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#44)
#46Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#45)
#47Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#46)
#48Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#47)
#49Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#48)
#50Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#49)
#51Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#50)
#52Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#51)
#53Andres Freund
andres@anarazel.de
In reply to: Andrey Borodin (#52)
#54Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#52)
#55Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#54)
#56Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#55)
#57Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#56)
#58Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#57)
In reply to: Noah Misch (#58)
#60Andrey Borodin
amborodin@acm.org
In reply to: Andres Freund (#53)
#61Andres Freund
andres@anarazel.de
In reply to: Andrey Borodin (#60)
#62Andrey Borodin
amborodin@acm.org
In reply to: Andres Freund (#61)
#63Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#62)
#64Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#63)
#65Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#64)
#66Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#65)
#67Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#66)
#68Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#67)
#69Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#68)
#70Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#69)
#71Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#70)
#72Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#71)
#73Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#72)
#74Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#70)
#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#73)
#76Andrey Borodin
amborodin@acm.org
In reply to: Tom Lane (#75)
#77Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#76)
#78Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#77)
#79Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#78)
#80Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#79)
#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#80)
#82Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#80)
#83Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#81)
#84Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#83)
#85Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#82)
#86Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#85)
#87Andrey Borodin
amborodin@acm.org
In reply to: Tom Lane (#82)
#88Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#74)
#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#87)
#90Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#88)
#91Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#90)
#92Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#73)
#93Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#92)
#94Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#93)
#95Andrey Borodin
amborodin@acm.org
In reply to: Thomas Munro (#94)
#96Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#94)
#97Thomas Munro
thomas.munro@gmail.com
In reply to: Andrey Borodin (#95)
#98Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Noah Misch (#88)
#99Andrey Borodin
amborodin@acm.org
In reply to: Sandeep Thakkar (#98)
#100Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Andrey Borodin (#99)
#101Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Sandeep Thakkar (#100)
#102Noah Misch
noah@leadboat.com
In reply to: Sandeep Thakkar (#101)
#103Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Noah Misch (#102)
#104Andrey Borodin
amborodin@acm.org
In reply to: Sandeep Thakkar (#103)
#105Andres Freund
andres@anarazel.de
In reply to: Sandeep Thakkar (#103)
#106Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Andres Freund (#105)
#107Andrey Borodin
amborodin@acm.org
In reply to: Sandeep Thakkar (#106)
#108Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Andrey Borodin (#107)
#109Robert Haas
robertmhaas@gmail.com
In reply to: Sandeep Thakkar (#108)
#110Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Robert Haas (#109)
#111Noah Misch
noah@leadboat.com
In reply to: Sandeep Thakkar (#110)
#112Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#111)
#113Thomas Munro
thomas.munro@gmail.com
In reply to: Semab Tariq (#112)
#114Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#113)
#115Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#114)
#116Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#115)
#117Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#116)
#118Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#117)
#119Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#118)
#120Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#119)
#121Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#120)
#122Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#121)
#123Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#122)
#124Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#123)
#125Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#124)
#126Semab Tariq
semab.tariq@enterprisedb.com
In reply to: Noah Misch (#125)
#127Noah Misch
noah@leadboat.com
In reply to: Semab Tariq (#126)
#128Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#127)
#129Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#128)
#130Noah Misch
noah@leadboat.com
In reply to: Noah Misch (#92)
#131Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#130)
#132Andrey Borodin
amborodin@acm.org
In reply to: Andrey Borodin (#131)
#133Noah Misch
noah@leadboat.com
In reply to: Andrey Borodin (#132)
#134Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#133)
#135Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#133)
#136Thomas Munro
thomas.munro@gmail.com
In reply to: Andrey Borodin (#135)
#137Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#136)
#138Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#137)
#139Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#138)
#140Andrey Borodin
amborodin@acm.org
In reply to: Noah Misch (#139)
#141Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#139)
#142Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#141)
#143Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#140)
#144Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#138)
#145Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#143)
#146Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#145)
#147Mikael Kjellström
mikael.kjellstrom@mksoft.nu
In reply to: Thomas Munro (#146)