BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog
The following bug has been logged on the website:
Bug reference: 14680
Logged by: chuanting wang
Email address: wangchuanting@huawei.com
PostgreSQL version: 10beta1
Operating system: SuSE
Description:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.
the stack is:
Thread 6 (Thread 0x7f002e3ff700 (LWP 35305)):
#0 0x00007f1106167627 in semop () from /lib64/libc.so.6
#1 0x0000000000ce5622 in PGSemaphoreLock(PGSemaphoreData*, bool) ()
#2 0x0000000000d73e4d in LWLockAcquire(LWLockId, LWLockMode) ()
#3 0x0000000000969028 in RemoveGXact(GlobalTransactionData*) ()
#4 0x000000000096b254 in ProcessTwoPhaseBuffer(unsigned int, XLogRecPtr,
bool, bool, bool) ()
#5 0x000000000096b6b8 in PrescanPreparedTransactions(unsigned int**, int*)
()
#6 0x000000000097af97 in xlog_redo(XLogReaderState*) ()
#7 0x00000000009839df in StartupXLOG() ()
#8 0x0000000000cffd0b in StartupProcessMain() ()
#9 0x0000000000991539 in AuxiliaryProcessMain(int, char**) ()
#10 0x0000000000cfb209 in SubPostmasterMain(int, char**) ()
#11 0x0000000000cfb4f2 in MainStarterThreadFunc(void*) ()
#12 0x00007f11099a17b6 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f1106165d6d in clone () from /lib64/libc.so.6
#14 0x0000000000000000 in ?? ()
the reason is:
when do PrescanPreparedTransactions, startup process already granted a
shared lock on TwoPhaseStateLock, then it called RemoveGXact, but in
RemoveGXact it acquired exclusive lock on TwoPhaseStateLock, so it blocked
by itself
anyone, has some idea on how to fix it?
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.
Please provide an example of how to get into this state.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.Please provide an example of how to get into this state.
That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?
Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
master 1 standby for each datanode)), and benchmark tpcc, there is some
cross datanode transactions that use 2pc, during testing, we restart the
cluster, then one datanode standby can not recovery done and hangup with
TwoPhaseStateLock deadlock.
and sorry, we reinstall the cluster, and the log is removed, we will try to
reproduce, but anyway, the code is not right like Michael saied.
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964241.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.Please provide an example of how to get into this state.
That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?
Attached is what I was thinking about for reference. I just came back
from a long flight and I am pretty tired, so my brain may have missed
something. I'll take again a look at this issue on Monday, an open
item has been added for now.
--
Michael
Attachments:
2pc-redo-lwlock-fix.patchapplication/octet-stream; name=2pc-redo-lwlock-fix.patchDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..96986ad954 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -338,18 +338,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -624,6 +619,8 @@ LockGXact(const char *gid, Oid user)
/*
* RemoveGXact
* Remove the prepared transaction from the shared memory array.
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as
+ * data of TwoPhaseState gets updated.
*
* NB: caller should have already removed it from ProcArray
*/
@@ -632,8 +629,6 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
if (gxact == TwoPhaseState->prepXacts[i])
@@ -646,8 +641,6 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
@@ -1506,7 +1499,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1781,6 +1776,9 @@ restoreTwoPhaseData(void)
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
* top-level xids is stored in *xids_p. The number of entries in the array
* is returned in *nxids_p.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
*/
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
@@ -1792,7 +1790,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1861,13 +1859,16 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
* The lack of calls to SubTransSetParent() calls here is by design;
* those calls are made by RecoverPreparedTransactions() at the end of recovery
* for those xacts that need this.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as it may get updated.
*/
void
StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -2001,6 +2002,9 @@ RecoverPreparedTransactions(void)
*
* If setNextXid is true, set ShmemVariableCache->nextXid to the newest
* value scanned.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data
+ * of TwoPhaseState gets updated.
*/
static char *
ProcessTwoPhaseBuffer(TransactionId xid,
@@ -2361,6 +2365,9 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* Remove the corresponding gxact entry from TwoPhaseState. Also
* remove the 2PC file if a prepared transaction was saved via
* an earlier checkpoint.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as
+ * data of TwoPhaseState gets updated.
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2371,7 +2378,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
Assert(RecoveryInProgress());
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2389,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
On Thu, Jun 1, 2017 at 12:11 AM, wangchuanting <wangchuanting@huawei.com> wrote:
we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
master 1 standby for each datanode)), and benchmark tpcc, there is some
cross datanode transactions that use 2pc, during testing, we restart the
cluster, then one datanode standby can not recovery done and hangup with
TwoPhaseStateLock deadlock.
(Former Postgres-XC maintainer here)
Are you aware of the fact that this is not going to work? Postgres
protocol has been extended between coordinators and datanodes to be
able to push down transaction ID, snapshot as well as timestamps when
running transactions across nodes. So by using a community Postgres as
a datanode you break the global consistency of the cluster. There are
also a couple of other things proper to datanodes.
and sorry, we reinstall the cluster, and the log is removed, we will try to
reproduce, but anyway, the code is not right like Michael said.
On that I agree.
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
So by using a community Postgres as
a datanode you break the global consistency of the cluster. There are
also a couple of other things proper to datanodes.
sorry, i confuse you, actually we merged "Speedup 2PC recovery by skipping
two phase state files in normal path" to pgxc.
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964398.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Hi Michael,
appreciate for your quick response.
i look carefully on the patch, because of we removed TwoPhaseStateLock
lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
1. xact_redo also need held lwlock before call PrepareRedoRemove
2. RecoverPreparedTransactions also need held lwlock before call
ProcessTwoPhaseBuffer
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964407.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.Please provide an example of how to get into this state.
That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?Attached is what I was thinking about for reference. I just came back
from a long flight and I am pretty tired, so my brain may have missed
something. I'll take again a look at this issue on Monday, an open
item has been added for now.
[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Simon,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1]/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.
[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Fri, Jun 2, 2017 at 10:59 AM, wangchuanting <wangchuanting@huawei.com> wrote:
Appreciate for your quick response.
Back into business for this issue.
i look carefully on the patch, because of we removed TwoPhaseStateLock
lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
1. xact_redo also need held lwlock before call PrepareRedoRemove
2. RecoverPreparedTransactions also need held lwlock before call
ProcessTwoPhaseBuffer
Thanks for the input. I have reviewed all the code paths that have
been modified, and strengthened the code with assertions using
LWLockHeldByMeInMode() to make sure that the correct lock is always
hold in those code paths. There is actually no point in holding the
lock in restoreTwoPhaseData(), but as this makes the code less
consistent with the rest I added one. Also, I have replaced the lock
acquisition in PrepareRedoAdd() with acquisitions at higher levels,
and added an assertion in this routine. This makes the 2PC state data
addition and removal more consistent with each other.
--
Michael
Attachments:
2pc-redo-lwlock-fix-v2.patchapplication/octet-stream; name=2pc-redo-lwlock-fix-v2.patchDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..877330d409 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -338,18 +338,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -530,15 +525,15 @@ GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
/*
* MarkAsPrepared
* Mark the GXACT as fully valid, and enter it into the global ProcArray.
+ * An exclusive lock on TwoPhaseStateLock needs to be taken by caller.
*/
static void
MarkAsPrepared(GlobalTransaction gxact)
{
- /* Lock here may be overkill, but I'm not convinced of that ... */
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
Assert(!gxact->valid);
gxact->valid = true;
- LWLockRelease(TwoPhaseStateLock);
/*
* Put it into the global ProcArray so TransactionIdIsInProgress considers
@@ -624,6 +619,8 @@ LockGXact(const char *gid, Oid user)
/*
* RemoveGXact
* Remove the prepared transaction from the shared memory array.
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as
+ * data of TwoPhaseState gets updated.
*
* NB: caller should have already removed it from ProcArray
*/
@@ -632,7 +629,7 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -646,14 +643,10 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
- LWLockRelease(TwoPhaseStateLock);
-
elog(ERROR, "failed to find %p in GlobalTransaction array", gxact);
}
@@ -1126,8 +1119,12 @@ EndPrepare(GlobalTransaction gxact)
* TransactionIdIsInProgress, and onlookers would be entitled to assume
* the xact crashed. Instead we have a window where the same XID appears
* twice in ProcArray, which is OK.
+ *
+ * Lock here may be overkill, but I'm not convinced of that ...
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
MarkAsPrepared(gxact);
+ LWLockRelease(TwoPhaseStateLock);
/*
* Now we can mark ourselves as out of the commit critical section: a
@@ -1506,7 +1503,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1733,6 +1732,7 @@ restoreTwoPhaseData(void)
DIR *cldir;
struct dirent *clde;
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
cldir = AllocateDir(TWOPHASE_DIR);
while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
{
@@ -1753,6 +1753,7 @@ restoreTwoPhaseData(void)
}
}
FreeDir(cldir);
+ LWLockRelease(TwoPhaseStateLock);
}
/*
@@ -1781,6 +1782,9 @@ restoreTwoPhaseData(void)
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
* top-level xids is stored in *xids_p. The number of entries in the array
* is returned in *nxids_p.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
*/
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
@@ -1792,7 +1796,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1861,13 +1865,16 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
* The lack of calls to SubTransSetParent() calls here is by design;
* those calls are made by RecoverPreparedTransactions() at the end of recovery
* for those xacts that need this.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as it may get updated.
*/
void
StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1931,9 +1938,11 @@ RecoverPreparedTransactions(void)
* SubTransSetParent has been set before, if the prepared transaction
* generated xid assignment records.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
buf = ProcessTwoPhaseBuffer(xid,
gxact->prepare_start_lsn,
gxact->ondisk, true, false);
+ LWLockRelease(TwoPhaseStateLock);
if (buf == NULL)
continue;
@@ -1963,11 +1972,11 @@ RecoverPreparedTransactions(void)
/* recovered, so reset the flag for entries generated by redo */
gxact->inredo = false;
- LWLockRelease(TwoPhaseStateLock);
-
GXactLoadSubxactData(gxact, hdr->nsubxacts, subxids);
MarkAsPrepared(gxact);
+ LWLockRelease(TwoPhaseStateLock);
+
/*
* Recover other state (notably locks) using resource managers
*/
@@ -2001,6 +2010,9 @@ RecoverPreparedTransactions(void)
*
* If setNextXid is true, set ShmemVariableCache->nextXid to the newest
* value scanned.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data
+ * of TwoPhaseState gets updated.
*/
static char *
ProcessTwoPhaseBuffer(TransactionId xid,
@@ -2014,6 +2026,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
TwoPhaseFileHeader *hdr;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
if (!fromdisk)
Assert(prepare_start_lsn != InvalidXLogRecPtr);
@@ -2299,6 +2313,10 @@ RecordTransactionAbortPrepared(TransactionId xid,
* a gxact entry in shared memory TwoPhaseState structure. If caller
* specifies InvalidXLogRecPtr as WAL location to fetch the two-phase
* data, the entry is marked as located on disk.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data of
+ * TwoPhaseState gets updated. Note that the lock is not directly taken in
+ * this routine as it can be called in restoreTwoPhaseData().
*/
void
PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
@@ -2309,6 +2327,7 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
GlobalTransaction gxact;
Assert(RecoveryInProgress());
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
bufptr = buf + MAXALIGN(sizeof(TwoPhaseFileHeader));
gid = (const char *) bufptr;
@@ -2324,7 +2343,6 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* that it got added in the redo phase
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
/* Get a free gxact from the freelist */
if (TwoPhaseState->freeGXacts == NULL)
ereport(ERROR,
@@ -2350,17 +2368,18 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
elog(DEBUG2, "Adding 2PC data to shared memory %u", gxact->xid);
}
/*
* PrepareRedoRemove
*
- * Remove the corresponding gxact entry from TwoPhaseState. Also
- * remove the 2PC file if a prepared transaction was saved via
- * an earlier checkpoint.
+ * Remove the corresponding gxact entry from TwoPhaseState. Also remove
+ * the 2PC file if a prepared transaction was saved via an earlier checkpoint.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data of
+ * TwoPhaseState gets updated. Note that the lock is not directly taken in
+ * this routine as it can be called in ProcessTwoPhaseBuffer().
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2370,8 +2389,8 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
bool found = false;
Assert(RecoveryInProgress());
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2402,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 7e8c598f2a..6b60ea9577 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5619,7 +5619,9 @@ xact_redo(XLogReaderState *record)
record->EndRecPtr, XLogRecGetOrigin(record));
/* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
}
else if (info == XLOG_XACT_ABORT || info == XLOG_XACT_ABORT_PREPARED)
@@ -5641,7 +5643,9 @@ xact_redo(XLogReaderState *record)
xact_redo_abort(&parsed, parsed.twophase_xid);
/* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
}
else if (info == XLOG_XACT_PREPARE)
@@ -5650,9 +5654,11 @@ xact_redo(XLogReaderState *record)
* Store xid and start/end pointers of the WAL record in TwoPhaseState
* gxact entry.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoAdd(XLogRecGetData(record),
record->ReadRecPtr,
record->EndRecPtr);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_ASSIGNMENT)
{
On Mon, Jun 5, 2017 at 7:24 AM, Noah Misch <noah@leadboat.com> wrote:
[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Simon,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.
I have just provided a patch that takes care of solving this issue and
cleans up the lock handling for all the 2PC redo code paths.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi, Michael
I don't understand why the patch remove TwoPhaseStateLock in MarkAsPrepared.
if so:
1. does it need remove lock acquire in MarkAsPreparing also?
2. MarkAsPrepared will call ProcArrayAdd, and in ProcArrayAdd it acquires
ProcArrayLock, we should carefully handle the lock sequence about these two
locks.
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5964796.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Mon, Jun 5, 2017 at 8:32 PM, wangchuanting <wangchuanting@huawei.com> wrote:
I don't understand why the patch remove TwoPhaseStateLock in MarkAsPrepared.
This one is on purpose, no low-level routines in this patch acquire
TwoPhaseStateLock. Logically I agree that it does not change much
but... You wrote something below about that.
if so:
1. does it need remove lock acquire in MarkAsPreparing also?
No, MarkAsPreparing() is used only in the non-redo code path.
2. MarkAsPrepared will call ProcArrayAdd, and in ProcArrayAdd it acquires
ProcArrayLock, we should carefully handle the lock sequence about these two
locks.
And this gives a reason to not complicate our lives.
--
Michael
Attachments:
2pc-redo-lwlock-fix-v3.patchapplication/octet-stream; name=2pc-redo-lwlock-fix-v3.patchDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..099e2b420d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -338,18 +338,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -624,6 +619,8 @@ LockGXact(const char *gid, Oid user)
/*
* RemoveGXact
* Remove the prepared transaction from the shared memory array.
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as
+ * data of TwoPhaseState gets updated.
*
* NB: caller should have already removed it from ProcArray
*/
@@ -632,7 +629,7 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -646,14 +643,10 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
- LWLockRelease(TwoPhaseStateLock);
-
elog(ERROR, "failed to find %p in GlobalTransaction array", gxact);
}
@@ -1506,7 +1499,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1733,6 +1728,7 @@ restoreTwoPhaseData(void)
DIR *cldir;
struct dirent *clde;
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
cldir = AllocateDir(TWOPHASE_DIR);
while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
{
@@ -1753,6 +1749,7 @@ restoreTwoPhaseData(void)
}
}
FreeDir(cldir);
+ LWLockRelease(TwoPhaseStateLock);
}
/*
@@ -1781,6 +1778,9 @@ restoreTwoPhaseData(void)
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
* top-level xids is stored in *xids_p. The number of entries in the array
* is returned in *nxids_p.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
*/
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
@@ -1792,7 +1792,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1861,13 +1861,16 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
* The lack of calls to SubTransSetParent() calls here is by design;
* those calls are made by RecoverPreparedTransactions() at the end of recovery
* for those xacts that need this.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as it may get updated.
*/
void
StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1931,9 +1934,11 @@ RecoverPreparedTransactions(void)
* SubTransSetParent has been set before, if the prepared transaction
* generated xid assignment records.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
buf = ProcessTwoPhaseBuffer(xid,
gxact->prepare_start_lsn,
gxact->ondisk, true, false);
+ LWLockRelease(TwoPhaseStateLock);
if (buf == NULL)
continue;
@@ -2001,6 +2006,9 @@ RecoverPreparedTransactions(void)
*
* If setNextXid is true, set ShmemVariableCache->nextXid to the newest
* value scanned.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data
+ * of TwoPhaseState gets updated.
*/
static char *
ProcessTwoPhaseBuffer(TransactionId xid,
@@ -2014,6 +2022,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
TwoPhaseFileHeader *hdr;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
if (!fromdisk)
Assert(prepare_start_lsn != InvalidXLogRecPtr);
@@ -2299,6 +2309,10 @@ RecordTransactionAbortPrepared(TransactionId xid,
* a gxact entry in shared memory TwoPhaseState structure. If caller
* specifies InvalidXLogRecPtr as WAL location to fetch the two-phase
* data, the entry is marked as located on disk.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data of
+ * TwoPhaseState gets updated. Note that the lock is not directly taken in
+ * this routine as it can be called in restoreTwoPhaseData().
*/
void
PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
@@ -2309,6 +2323,7 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
GlobalTransaction gxact;
Assert(RecoveryInProgress());
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
bufptr = buf + MAXALIGN(sizeof(TwoPhaseFileHeader));
gid = (const char *) bufptr;
@@ -2324,7 +2339,6 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* that it got added in the redo phase
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
/* Get a free gxact from the freelist */
if (TwoPhaseState->freeGXacts == NULL)
ereport(ERROR,
@@ -2350,17 +2364,18 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
elog(DEBUG2, "Adding 2PC data to shared memory %u", gxact->xid);
}
/*
* PrepareRedoRemove
*
- * Remove the corresponding gxact entry from TwoPhaseState. Also
- * remove the 2PC file if a prepared transaction was saved via
- * an earlier checkpoint.
+ * Remove the corresponding gxact entry from TwoPhaseState. Also remove
+ * the 2PC file if a prepared transaction was saved via an earlier checkpoint.
+ *
+ * Caller needs to hold an exclusive lock on TwoPhaseStateLock as data of
+ * TwoPhaseState gets updated. Note that the lock is not directly taken in
+ * this routine as it can be called in ProcessTwoPhaseBuffer().
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2370,8 +2385,8 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
bool found = false;
Assert(RecoveryInProgress());
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2398,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 7e8c598f2a..6b60ea9577 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5619,7 +5619,9 @@ xact_redo(XLogReaderState *record)
record->EndRecPtr, XLogRecGetOrigin(record));
/* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
}
else if (info == XLOG_XACT_ABORT || info == XLOG_XACT_ABORT_PREPARED)
@@ -5641,7 +5643,9 @@ xact_redo(XLogReaderState *record)
xact_redo_abort(&parsed, parsed.twophase_xid);
/* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
}
else if (info == XLOG_XACT_PREPARE)
@@ -5650,9 +5654,11 @@ xact_redo(XLogReaderState *record)
* Store xid and start/end pointers of the WAL record in TwoPhaseState
* gxact entry.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoAdd(XLogRecGetData(record),
record->ReadRecPtr,
record->EndRecPtr);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_ASSIGNMENT)
{
On Sun, Jun 04, 2017 at 10:24:30PM +0000, Noah Misch wrote:
On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.Please provide an example of how to get into this state.
That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?Attached is what I was thinking about for reference. I just came back
from a long flight and I am pretty tired, so my brain may have missed
something. I'll take again a look at this issue on Monday, an open
item has been added for now.[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Simon,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.[1] /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Fri, Jun 9, 2017 at 3:17 PM, Noah Misch <noah@leadboat.com> wrote:
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
I have sent an updated patch simplifying the locking here:
/messages/by-id/CAB7nPqQthLP9GvD2242epHKOBkDMd+03tSuFvK3cVZsGarQyWA@mail.gmail.com
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Thu, Jun 08, 2017 at 11:17:38PM -0700, Noah Misch wrote:
On Sun, Jun 04, 2017 at 10:24:30PM +0000, Noah Misch wrote:
On Thu, Jun 01, 2017 at 01:07:53AM -0700, Michael Paquier wrote:
On Wed, May 31, 2017 at 12:30 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:On Wed, May 31, 2017 at 6:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
wangchuanting@huawei.com writes:
startup process on standby encounter a deadlock of TwoPhaseStateLock when
redo 2PC xlog.Please provide an example of how to get into this state.
That would help. Are you seeing in the logs something like "removing
future two-phase state from memory for XXX" or "removing stale
two-phase state from shared memory for XXX"?Even with that, the light-weight lock sequence taken in those code
paths look definitely wrong to me, we should not take twice
TwoPhaseStateLock in the same code path. I think that we should remove
the lock acquisitions in RemoveGXact() and PrepareRedoRemove, and then
upgrade the locks of PrescanPreparedTransactions() and
StandbyRecoverPreparedTransactions() to be exclusive. We still need to
keep a lock as CheckPointTwoPhase() may still be triggered by the
checkpoint. Tom, what do you think?Attached is what I was thinking about for reference. I just came back
from a long flight and I am pretty tired, so my brain may have missed
something. I'll take again a look at this issue on Monday, an open
item has been added for now.[Action required within three days. This is a generic notification.]
The above-described topic is currently a PostgreSQL 10 open item. Simon,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.[1] /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
This PostgreSQL 10 open item is past due for your status update. Kindly send
a status update within 24 hours, and include a date for your subsequent status
update. Refer to the policy on open item ownership:
/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due
for your status update. Please reacquaint yourself with the policy on open
item ownership[1]/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com and then reply immediately. If I do not hear from you by
2017-06-11 07:00 UTC, I will transfer this item to release management team
ownership without further notice.
[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Noah Misch wrote:
IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due
for your status update. Please reacquaint yourself with the policy on open
item ownership[1] and then reply immediately. If I do not hear from you by
2017-06-11 07:00 UTC, I will transfer this item to release management team
ownership without further notice.
I volunteer to own this item. I'll report on Wednesday 14th at the latest.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Michael Paquier wrote:
i look carefully on the patch, because of we removed TwoPhaseStateLock
lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
1. xact_redo also need held lwlock before call PrepareRedoRemove
2. RecoverPreparedTransactions also need held lwlock before call
ProcessTwoPhaseBufferThanks for the input. I have reviewed all the code paths that have
been modified, and strengthened the code with assertions using
LWLockHeldByMeInMode() to make sure that the correct lock is always
hold in those code paths. There is actually no point in holding the
lock in restoreTwoPhaseData(), but as this makes the code less
consistent with the rest I added one. Also, I have replaced the lock
acquisition in PrepareRedoAdd() with acquisitions at higher levels,
and added an assertion in this routine. This makes the 2PC state data
addition and removal more consistent with each other.
Thanks for the patch -- I agree that the new coding looks better.
I also agree that restoreTwoPhaseData doesn't need to hold the lock,
since we don't expect anything running concurrently, but that it's okay
to hold it and makes the whole thing simpler.
We could try to apply an equivalent rationale to
RecoverPreparedTransactions: even though we have already been running in
HOT standby mode for a while, there's no possibility of concurrent
activity either: since we exited recovery, master cannot write any more
2PC xacts, and we haven't yet set the flag that lets open sessions write
WAL. However, it seems mildly dangerous to run the bottom sections of
the loop with the lock held, since it acquires other lwlocks and
generally does a lot of crap.
Also, let's add an lwlock-is-held assertion to MarkAsPreparingGuts.
BTW: I think that saving one redundant line of code is not worth the
ugliness that it costs in xact_redo, so let's undo that. The patch is a
bit bulky, but the resulting code is simpler.
In short, I propose the attached.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
2pc-redo-lwlock-fix-v4.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..bd729923f1 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -195,7 +195,10 @@ typedef struct TwoPhaseStateData
static TwoPhaseStateData *TwoPhaseState;
/*
- * Global transaction entry currently locked by us, if any.
+ * Global transaction entry currently locked by us, if any. Note that any
+ * access to the entry pointed to by this variable must be protected by
+ * TwoPhaseStateLock, though obviously the pointer itself doesn't need to be
+ * (since it's just local memory).
*/
static GlobalTransaction MyLockedGxact = NULL;
@@ -338,18 +341,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -454,6 +452,8 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
PGXACT *pgxact;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
Assert(gxact != NULL);
proc = &ProcGlobal->allProcs[gxact->pgprocno];
pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
@@ -632,7 +632,7 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -646,14 +646,10 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
- LWLockRelease(TwoPhaseStateLock);
-
elog(ERROR, "failed to find %p in GlobalTransaction array", gxact);
}
@@ -1506,7 +1502,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1734,6 +1732,7 @@ restoreTwoPhaseData(void)
struct dirent *clde;
cldir = AllocateDir(TWOPHASE_DIR);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
{
if (strlen(clde->d_name) == 8 &&
@@ -1752,6 +1751,7 @@ restoreTwoPhaseData(void)
PrepareRedoAdd(buf, InvalidXLogRecPtr, InvalidXLogRecPtr);
}
}
+ LWLockRelease(TwoPhaseStateLock);
FreeDir(cldir);
}
@@ -1781,6 +1781,9 @@ restoreTwoPhaseData(void)
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
* top-level xids is stored in *xids_p. The number of entries in the array
* is returned in *nxids_p.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
*/
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
@@ -1792,7 +1795,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1867,7 +1870,7 @@ StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1893,7 +1896,8 @@ StandbyRecoverPreparedTransactions(void)
* Scan the shared memory entries of TwoPhaseState and reload the state for
* each prepared transaction (reacquire locks, etc).
*
- * This is run during database startup.
+ * This is run at the end of recovery, but before we allow backends to write
+ * WAL.
*
* At the end of recovery the way we take snapshots will change. We now need
* to mark all running transactions with their full SubTransSetParent() info
@@ -1908,7 +1912,13 @@ RecoverPreparedTransactions(void)
int i;
/*
- * Don't need a lock in the recovery phase.
+ * It's okay to access TwoPhaseState without a lock here: recovery is
+ * finished (so if we were a standby, there's no master that can prepare
+ * transactions anymore), and we haven't yet set WAL as open for writes,
+ * so local existing backends, if any, cannot do so either. We could use a
+ * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop
+ * with the lock held; but this loop is far more complex, so instead only
+ * grab the lock while calling the low-level functions that require it.
*/
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -1931,9 +1941,11 @@ RecoverPreparedTransactions(void)
* SubTransSetParent has been set before, if the prepared transaction
* generated xid assignment records.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
buf = ProcessTwoPhaseBuffer(xid,
gxact->prepare_start_lsn,
gxact->ondisk, true, false);
+ LWLockRelease(TwoPhaseStateLock);
if (buf == NULL)
continue;
@@ -1962,14 +1974,13 @@ RecoverPreparedTransactions(void)
/* recovered, so reset the flag for entries generated by redo */
gxact->inredo = false;
-
LWLockRelease(TwoPhaseStateLock);
GXactLoadSubxactData(gxact, hdr->nsubxacts, subxids);
MarkAsPrepared(gxact);
/*
- * Recover other state (notably locks) using resource managers
+ * Recover other state (notably locks) using resource managers.
*/
ProcessRecords(bufptr, xid, twophase_recover_callbacks);
@@ -2014,6 +2025,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
TwoPhaseFileHeader *hdr;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
if (!fromdisk)
Assert(prepare_start_lsn != InvalidXLogRecPtr);
@@ -2030,8 +2043,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
else
{
ereport(WARNING,
- (errmsg("removing stale two-phase state from"
- " shared memory for \"%u\"", xid)));
+ (errmsg("removing stale two-phase state from shared memory for \"%u\"",
+ xid)));
PrepareRedoRemove(xid, true);
}
return NULL;
@@ -2308,6 +2321,7 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
const char *gid;
GlobalTransaction gxact;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
bufptr = buf + MAXALIGN(sizeof(TwoPhaseFileHeader));
@@ -2324,7 +2338,6 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* that it got added in the redo phase
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
/* Get a free gxact from the freelist */
if (TwoPhaseState->freeGXacts == NULL)
ereport(ERROR,
@@ -2350,17 +2363,17 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
- elog(DEBUG2, "Adding 2PC data to shared memory %u", gxact->xid);
+ elog(DEBUG2, "added 2PC data in shared memory for transaction %u", gxact->xid);
}
/*
* PrepareRedoRemove
*
- * Remove the corresponding gxact entry from TwoPhaseState. Also
- * remove the 2PC file if a prepared transaction was saved via
- * an earlier checkpoint.
+ * Remove the corresponding gxact entry from TwoPhaseState. Also remove
+ * the 2PC file if a prepared transaction was saved via an earlier checkpoint.
+ *
+ * Caller must hold TwoPhaseStateLock in exclusive mode, because TwoPhaseState
+ * is updated.
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2369,9 +2382,9 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
int i;
bool found = false;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2396,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
@@ -2394,7 +2406,7 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
/*
* And now we can clean up any files we may have left.
*/
- elog(DEBUG2, "Removing 2PC data from shared memory %u", xid);
+ elog(DEBUG2, "removing 2PC data for transaction %u", xid);
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, giveWarning);
RemoveGXact(gxact);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 7e8c598f2a..e09696c37f 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5351,6 +5351,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
int i;
TimestampTz commit_time;
+ Assert(TransactionIdIsValid(xid));
+
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
/*
@@ -5518,6 +5520,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
int i;
TransactionId max_xid;
+ Assert(TransactionIdIsValid(xid));
+
/*
* Make sure nextXid is beyond any XID mentioned in the record.
*
@@ -5598,51 +5602,49 @@ xact_redo(XLogReaderState *record)
/* Backup blocks are not used in xact records */
Assert(!XLogRecHasAnyBlockRefs(record));
- if (info == XLOG_XACT_COMMIT || info == XLOG_XACT_COMMIT_PREPARED)
+ if (info == XLOG_XACT_COMMIT)
{
xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
xl_xact_parsed_commit parsed;
- ParseCommitRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
-
- if (info == XLOG_XACT_COMMIT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, XLogRecGetXid(record),
- record->EndRecPtr, XLogRecGetOrigin(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, parsed.twophase_xid,
- record->EndRecPtr, XLogRecGetOrigin(record));
-
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, XLogRecGetXid(record),
+ record->EndRecPtr, XLogRecGetOrigin(record));
}
- else if (info == XLOG_XACT_ABORT || info == XLOG_XACT_ABORT_PREPARED)
+ else if (info == XLOG_XACT_COMMIT_PREPARED)
+ {
+ xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
+ xl_xact_parsed_commit parsed;
+
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, parsed.twophase_xid,
+ record->EndRecPtr, XLogRecGetOrigin(record));
+
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
+ }
+ else if (info == XLOG_XACT_ABORT)
{
xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
xl_xact_parsed_abort parsed;
- ParseAbortRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, XLogRecGetXid(record));
+ }
+ else if (info == XLOG_XACT_ABORT_PREPARED)
+ {
+ xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
+ xl_xact_parsed_abort parsed;
- if (info == XLOG_XACT_ABORT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, XLogRecGetXid(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, parsed.twophase_xid);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, parsed.twophase_xid);
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_PREPARE)
{
@@ -5650,9 +5652,11 @@ xact_redo(XLogReaderState *record)
* Store xid and start/end pointers of the WAL record in TwoPhaseState
* gxact entry.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoAdd(XLogRecGetData(record),
record->ReadRecPtr,
record->EndRecPtr);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_ASSIGNMENT)
{
Alvaro Herrera wrote:
For some reason I sent a preliminary version of the email I was writing.
Some additional thoughts that were supposed to be in there:
I also agree that restoreTwoPhaseData doesn't need to hold the lock,
since we don't expect anything running concurrently, but that it's okay
to hold it and makes the whole thing simpler.
It's okay to hold the lock for the whole duration of the function.
We could try to apply an equivalent rationale to
RecoverPreparedTransactions: even though we have already been running in
HOT standby mode for a while, there's no possibility of concurrent
activity either: since we exited recovery, master cannot write any more
2PC xacts, and we haven't yet set the flag that lets open sessions write
WAL. However, it seems mildly dangerous to run the bottom sections of
the loop with the lock held, since it acquires other lwlocks and
generally does a lot of crap.
Therefore, let's adopt your idea of acquiring the lock only for the
specific low-level calls that require it. But the comment atop the loop
needs a lot of work to explain why it's doing that (i.e. how come it's
reading TwoPhaseState without a lock), as in my proposed patch.
(At first, I didn't really like this very much and wanted to add more
locking to that function, but it turned quite messy, so I back-tracked).
Thanks
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
wangchuanting wrote:
we use Postgres-XC+pg10(4 coordinator + 4 datanode(pg 10 as datanode, and 1
master 1 standby for each datanode)), and benchmark tpcc, there is some
cross datanode transactions that use 2pc, during testing, we restart the
cluster, then one datanode standby can not recovery done and hangup with
TwoPhaseStateLock deadlock.and sorry, we reinstall the cluster, and the log is removed, we will try to
reproduce, but anyway, the code is not right like Michael saied.
Did you have any luck reproducing the original issue? It would be good
if you could confirm that the bug is gone with the patch I posted in
/messages/by-id/20170611022536.goef4cdbmgicye5g@alvherre.pgsql
Thanks
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Sun, Jun 11, 2017 at 12:24 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Alvaro Herrera wrote:
For some reason I sent a preliminary version of the email I was writing.
Some additional thoughts that were supposed to be in there:
No problem. Thanks for the input.
I also agree that restoreTwoPhaseData doesn't need to hold the lock,
since we don't expect anything running concurrently, but that it's okay
to hold it and makes the whole thing simpler.It's okay to hold the lock for the whole duration of the function.
Yup.
We could try to apply an equivalent rationale to
RecoverPreparedTransactions: even though we have already been running in
HOT standby mode for a while, there's no possibility of concurrent
activity either: since we exited recovery, master cannot write any more
2PC xacts, and we haven't yet set the flag that lets open sessions write
WAL. However, it seems mildly dangerous to run the bottom sections of
the loop with the lock held, since it acquires other lwlocks and
generally does a lot of crap.Therefore, let's adopt your idea of acquiring the lock only for the
specific low-level calls that require it. But the comment atop the loop
needs a lot of work to explain why it's doing that (i.e. how come it's
reading TwoPhaseState without a lock), as in my proposed patch.(At first, I didn't really like this very much and wanted to add more
locking to that function, but it turned quite messy, so I back-tracked).
Yes, I got the same temptation but give up as well because of the
unnecessary complications. GXactLoadSubxactData() does nothing fancy,
but my worries are in ProcessRecords() and PostPrepare_Twophase(),
particularly if they do more complicated and fancy stuff in the future
for whatever reason. The comment you have added is this one:
/*
- * Don't need a lock in the recovery phase.
+ * It's okay to access TwoPhaseState without a lock here: recovery is
+ * finished (so if we were a standby, there's no master that can prepare
+ * transactions anymore), and we haven't yet set WAL as open for writes,
+ * so local existing backends, if any, cannot do so either.
We could use a
+ * coding pattern similar to restoreTwoPhaseData, i.e., run
the whole loop
+ * with the lock held; but this loop is far more complex, so
instead only
+ * grab the lock while calling the low-level functions that require it.
*/
I have reworked the comment as follows:
/*
- * Don't need a lock in the recovery phase.
+ * It is fine to access TwoPhaseState without a lock here: recovery is
+ * finished (so if we were a standby, there's no master that can prepare
+ * transactions anymore), and we haven't yet set WAL as open for writes,
+ * so local existing backends, if any, cannot do so either. We could use a
+ * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop
+ * with the lock held; but this loop is far more complex, so instead only
+ * grab the lock while calling the low-level functions working directly on
+ * manipulating the two-phase state data. Functions working directly on
+ * PGPROC entries linked with the two-phase transaction work with other
+ * types of locks but we don't want to complicate that more than necessary.
*/
The v5 attached does not have any other changes than what you did in
v4 (plus typos noticed).
- if (info == XLOG_XACT_COMMIT || info == XLOG_XACT_COMMIT_PREPARED)
+ if (info == XLOG_XACT_COMMIT)
{
xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
xl_xact_parsed_commit parsed;
No objections to the shuffling done here.
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as its may be updated.
Typo noticed here => s/its/it/.
--
Michael
Attachments:
2pc-redo-lwlock-fix-v5.patchapplication/octet-stream; name=2pc-redo-lwlock-fix-v5.patchDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..d1e1df0b6d 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -195,7 +195,10 @@ typedef struct TwoPhaseStateData
static TwoPhaseStateData *TwoPhaseState;
/*
- * Global transaction entry currently locked by us, if any.
+ * Global transaction entry currently locked by us, if any. Note that any
+ * access to the entry pointed to by this variable must be protected by
+ * TwoPhaseStateLock, though obviously the pointer itself doesn't need to be
+ * (since it's just local memory).
*/
static GlobalTransaction MyLockedGxact = NULL;
@@ -338,18 +341,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -454,6 +452,8 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
PGXACT *pgxact;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
Assert(gxact != NULL);
proc = &ProcGlobal->allProcs[gxact->pgprocno];
pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
@@ -632,7 +632,7 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -646,14 +646,10 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
- LWLockRelease(TwoPhaseStateLock);
-
elog(ERROR, "failed to find %p in GlobalTransaction array", gxact);
}
@@ -1506,7 +1502,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1734,6 +1732,7 @@ restoreTwoPhaseData(void)
struct dirent *clde;
cldir = AllocateDir(TWOPHASE_DIR);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
{
if (strlen(clde->d_name) == 8 &&
@@ -1752,6 +1751,7 @@ restoreTwoPhaseData(void)
PrepareRedoAdd(buf, InvalidXLogRecPtr, InvalidXLogRecPtr);
}
}
+ LWLockRelease(TwoPhaseStateLock);
FreeDir(cldir);
}
@@ -1781,6 +1781,9 @@ restoreTwoPhaseData(void)
* If xids_p and nxids_p are not NULL, pointer to a palloc'd array of all
* top-level xids is stored in *xids_p. The number of entries in the array
* is returned in *nxids_p.
+ *
+ * An exclusive lock on TwoPhaseStateLock is taken for the duration of the
+ * scan of TwoPhaseState data as it may be updated.
*/
TransactionId
PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
@@ -1792,7 +1795,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1867,7 +1870,7 @@ StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1893,7 +1896,8 @@ StandbyRecoverPreparedTransactions(void)
* Scan the shared memory entries of TwoPhaseState and reload the state for
* each prepared transaction (reacquire locks, etc).
*
- * This is run during database startup.
+ * This is run at the end of recovery, but before we allow backends to write
+ * WAL.
*
* At the end of recovery the way we take snapshots will change. We now need
* to mark all running transactions with their full SubTransSetParent() info
@@ -1908,7 +1912,16 @@ RecoverPreparedTransactions(void)
int i;
/*
- * Don't need a lock in the recovery phase.
+ * It is fine to access TwoPhaseState without a lock here: recovery is
+ * finished (so if we were a standby, there's no master that can prepare
+ * transactions anymore), and we haven't yet set WAL as open for writes,
+ * so local existing backends, if any, cannot do so either. We could use a
+ * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop
+ * with the lock held; but this loop is far more complex, so instead only
+ * grab the lock while calling the low-level functions working directly on
+ * manipulating the two-phase state data. Functions working directly on
+ * PGPROC entries linked with the two-phase transaction work with other
+ * types of locks but we don't want to complicate that more than necessary.
*/
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -1931,9 +1944,11 @@ RecoverPreparedTransactions(void)
* SubTransSetParent has been set before, if the prepared transaction
* generated xid assignment records.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
buf = ProcessTwoPhaseBuffer(xid,
gxact->prepare_start_lsn,
gxact->ondisk, true, false);
+ LWLockRelease(TwoPhaseStateLock);
if (buf == NULL)
continue;
@@ -1962,14 +1977,13 @@ RecoverPreparedTransactions(void)
/* recovered, so reset the flag for entries generated by redo */
gxact->inredo = false;
-
LWLockRelease(TwoPhaseStateLock);
GXactLoadSubxactData(gxact, hdr->nsubxacts, subxids);
MarkAsPrepared(gxact);
/*
- * Recover other state (notably locks) using resource managers
+ * Recover other state (notably locks) using resource managers.
*/
ProcessRecords(bufptr, xid, twophase_recover_callbacks);
@@ -2014,6 +2028,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
TwoPhaseFileHeader *hdr;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
if (!fromdisk)
Assert(prepare_start_lsn != InvalidXLogRecPtr);
@@ -2030,8 +2046,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
else
{
ereport(WARNING,
- (errmsg("removing stale two-phase state from"
- " shared memory for \"%u\"", xid)));
+ (errmsg("removing stale two-phase state from shared memory for \"%u\"",
+ xid)));
PrepareRedoRemove(xid, true);
}
return NULL;
@@ -2308,6 +2324,7 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
const char *gid;
GlobalTransaction gxact;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
bufptr = buf + MAXALIGN(sizeof(TwoPhaseFileHeader));
@@ -2324,7 +2341,6 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* that it got added in the redo phase
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
/* Get a free gxact from the freelist */
if (TwoPhaseState->freeGXacts == NULL)
ereport(ERROR,
@@ -2350,17 +2366,17 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
- elog(DEBUG2, "Adding 2PC data to shared memory %u", gxact->xid);
+ elog(DEBUG2, "added 2PC data in shared memory for transaction %u", gxact->xid);
}
/*
* PrepareRedoRemove
*
- * Remove the corresponding gxact entry from TwoPhaseState. Also
- * remove the 2PC file if a prepared transaction was saved via
- * an earlier checkpoint.
+ * Remove the corresponding gxact entry from TwoPhaseState. Also remove
+ * the 2PC file if a prepared transaction was saved via an earlier checkpoint.
+ *
+ * Caller must hold TwoPhaseStateLock in exclusive mode, because TwoPhaseState
+ * is updated.
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2369,9 +2385,9 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
int i;
bool found = false;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2399,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
@@ -2394,7 +2409,7 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
/*
* And now we can clean up any files we may have left.
*/
- elog(DEBUG2, "Removing 2PC data from shared memory %u", xid);
+ elog(DEBUG2, "removing 2PC data for transaction %u", xid);
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, giveWarning);
RemoveGXact(gxact);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 7e8c598f2a..e09696c37f 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5351,6 +5351,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
int i;
TimestampTz commit_time;
+ Assert(TransactionIdIsValid(xid));
+
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
/*
@@ -5518,6 +5520,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
int i;
TransactionId max_xid;
+ Assert(TransactionIdIsValid(xid));
+
/*
* Make sure nextXid is beyond any XID mentioned in the record.
*
@@ -5598,51 +5602,49 @@ xact_redo(XLogReaderState *record)
/* Backup blocks are not used in xact records */
Assert(!XLogRecHasAnyBlockRefs(record));
- if (info == XLOG_XACT_COMMIT || info == XLOG_XACT_COMMIT_PREPARED)
+ if (info == XLOG_XACT_COMMIT)
{
xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
xl_xact_parsed_commit parsed;
- ParseCommitRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, XLogRecGetXid(record),
+ record->EndRecPtr, XLogRecGetOrigin(record));
+ }
+ else if (info == XLOG_XACT_COMMIT_PREPARED)
+ {
+ xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
+ xl_xact_parsed_commit parsed;
- if (info == XLOG_XACT_COMMIT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, XLogRecGetXid(record),
- record->EndRecPtr, XLogRecGetOrigin(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, parsed.twophase_xid,
- record->EndRecPtr, XLogRecGetOrigin(record));
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, parsed.twophase_xid,
+ record->EndRecPtr, XLogRecGetOrigin(record));
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
- else if (info == XLOG_XACT_ABORT || info == XLOG_XACT_ABORT_PREPARED)
+ else if (info == XLOG_XACT_ABORT)
{
xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
xl_xact_parsed_abort parsed;
- ParseAbortRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, XLogRecGetXid(record));
+ }
+ else if (info == XLOG_XACT_ABORT_PREPARED)
+ {
+ xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
+ xl_xact_parsed_abort parsed;
- if (info == XLOG_XACT_ABORT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, XLogRecGetXid(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, parsed.twophase_xid);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, parsed.twophase_xid);
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_PREPARE)
{
@@ -5650,9 +5652,11 @@ xact_redo(XLogReaderState *record)
* Store xid and start/end pointers of the WAL record in TwoPhaseState
* gxact entry.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoAdd(XLogRecGetData(record),
record->ReadRecPtr,
record->EndRecPtr);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_ASSIGNMENT)
{
Did you have any luck reproducing the original issue?
sorry, not reproduced
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5966050.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Mon, Jun 12, 2017 at 3:25 PM, wangchuanting <wangchuanting@huawei.com> wrote:
Did you have any luck reproducing the original issue?
sorry, not reproduced
I am a bit confused. Do you mean that you applied the patch and did
not see the issue again? Or do you mean that you did not have the time
to check it?
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
I am a bit confused. Do you mean that you applied the patch and did
not see the issue again? Or do you mean that you did not have the time
to check it?
i means: this issue occurs only once in my test, and have not reproduce it.
i review the patch, and i think the patch fix the issue(but i have not
applied it, i will apply it tomorrow).
--
View this message in context: http://www.postgresql-archive.org/BUG-14680-startup-process-on-standby-encounter-a-deadlock-of-TwoPhaseStateLock-when-redo-2PC-xlog-tp5964069p5966106.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Michael Paquier wrote:
On Sun, Jun 11, 2017 at 12:24 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
I have reworked the comment as follows: /* - * Don't need a lock in the recovery phase. + * It is fine to access TwoPhaseState without a lock here: recovery is + * finished (so if we were a standby, there's no master that can prepare + * transactions anymore), and we haven't yet set WAL as open for writes, + * so local existing backends, if any, cannot do so either. We could use a + * coding pattern similar to restoreTwoPhaseData, i.e., run the whole loop + * with the lock held; but this loop is far more complex, so instead only + * grab the lock while calling the low-level functions working directly on + * manipulating the two-phase state data. Functions working directly on + * PGPROC entries linked with the two-phase transaction work with other + * types of locks but we don't want to complicate that more than necessary. */
Hmm. Honestly I don't like the final sentence you added. I find it
more confusing than useful, because it doesn't explain what these "other
types of locks" are, or why we care.
However, I found out that this rationale is likely not true, because the
checkpointer may be running concurrently with this code from startup
process, and checkpointer does process 2PC data. Maybe there are other
reasons why there's no live bug here, but it looks wrong (I didn't try
to reproduce a problem).
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Tue, Jun 13, 2017 at 7:49 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
However, I found out that this rationale is likely not true, because the
checkpointer may be running concurrently with this code from startup
process, and checkpointer does process 2PC data. Maybe there are other
reasons why there's no live bug here, but it looks wrong (I didn't try
to reproduce a problem).
The current coding is actually safe because the checkpointer does not
remove or add any 2PC entry in the array while holding
TwoPhaseStateLock, it just updates some values that need to be read
and/or written while holding the lock. Well, to be honest, HEAD is
wrong because it can read a flag value while the checkpointer updates
it, and the patch is careful to change that to be correct. The wrong
part is when calling ProcessTwoPhaseBuffer() in
RecoverPreparedTransactions() which accesses gxact->ondisk and
prepare_start_lsn without locking things.
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Michael Paquier wrote:
The current coding is actually safe because the checkpointer does not
remove or add any 2PC entry in the array while holding
TwoPhaseStateLock, it just updates some values that need to be read
and/or written while holding the lock. Well, to be honest, HEAD is
wrong because it can read a flag value while the checkpointer updates
it, and the patch is careful to change that to be correct. The wrong
part is when calling ProcessTwoPhaseBuffer() in
RecoverPreparedTransactions() which accesses gxact->ondisk and
prepare_start_lsn without locking things.
Honestly I don't like this rationale very much. Even if doing the
unlocked access is safe today, it looks like installing a landmine for
the future, and for what? I don't think there's a lot to be gained:
RecoverPreparedTransactions only runs once in the life of a server, and
CheckPointTwoPhase is supposed to have a very short runtime (per
explanation in comments therein). It seems better to me to continue our
tradition of using the appropriate locks instead of playing a dangerous
game.
So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe. (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).
Here's a patch along those lines. The full testsuite is running now,
but the recovery tests pass fine.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
2pc-redo-lwlock-fix-v6.patchtext/plain; charset=us-asciiDownload
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index c50f9c4bf6..e9751aa2f6 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -195,7 +195,10 @@ typedef struct TwoPhaseStateData
static TwoPhaseStateData *TwoPhaseState;
/*
- * Global transaction entry currently locked by us, if any.
+ * Global transaction entry currently locked by us, if any. Note that any
+ * access to the entry pointed to by this variable must be protected by
+ * TwoPhaseStateLock, though obviously the pointer itself doesn't need to be
+ * (since it's just local memory).
*/
static GlobalTransaction MyLockedGxact = NULL;
@@ -338,18 +341,13 @@ AtAbort_Twophase(void)
* resources held by the transaction yet. In those cases, the in-memory
* state can be wrong, but it's too late to back out.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
if (!MyLockedGxact->valid)
- {
RemoveGXact(MyLockedGxact);
- }
else
- {
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
-
MyLockedGxact->locking_backend = InvalidBackendId;
+ LWLockRelease(TwoPhaseStateLock);
- LWLockRelease(TwoPhaseStateLock);
- }
MyLockedGxact = NULL;
}
@@ -454,6 +452,8 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
PGXACT *pgxact;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
Assert(gxact != NULL);
proc = &ProcGlobal->allProcs[gxact->pgprocno];
pgxact = &ProcGlobal->allPgXact[gxact->pgprocno];
@@ -530,15 +530,19 @@ GXactLoadSubxactData(GlobalTransaction gxact, int nsubxacts,
/*
* MarkAsPrepared
* Mark the GXACT as fully valid, and enter it into the global ProcArray.
+ *
+ * lock_held indicates whether caller already holds TwoPhaseStateLock.
*/
static void
-MarkAsPrepared(GlobalTransaction gxact)
+MarkAsPrepared(GlobalTransaction gxact, bool lock_held)
{
/* Lock here may be overkill, but I'm not convinced of that ... */
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ if (!lock_held)
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
Assert(!gxact->valid);
gxact->valid = true;
- LWLockRelease(TwoPhaseStateLock);
+ if (!lock_held)
+ LWLockRelease(TwoPhaseStateLock);
/*
* Put it into the global ProcArray so TransactionIdIsInProgress considers
@@ -632,7 +636,7 @@ RemoveGXact(GlobalTransaction gxact)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
@@ -646,14 +650,10 @@ RemoveGXact(GlobalTransaction gxact)
gxact->next = TwoPhaseState->freeGXacts;
TwoPhaseState->freeGXacts = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
return;
}
}
- LWLockRelease(TwoPhaseStateLock);
-
elog(ERROR, "failed to find %p in GlobalTransaction array", gxact);
}
@@ -1127,7 +1127,7 @@ EndPrepare(GlobalTransaction gxact)
* the xact crashed. Instead we have a window where the same XID appears
* twice in ProcArray, which is OK.
*/
- MarkAsPrepared(gxact);
+ MarkAsPrepared(gxact, false);
/*
* Now we can mark ourselves as out of the commit critical section: a
@@ -1506,7 +1506,9 @@ FinishPreparedTransaction(const char *gid, bool isCommit)
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, true);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
RemoveGXact(gxact);
+ LWLockRelease(TwoPhaseStateLock);
MyLockedGxact = NULL;
pfree(buf);
@@ -1734,6 +1736,7 @@ restoreTwoPhaseData(void)
struct dirent *clde;
cldir = AllocateDir(TWOPHASE_DIR);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
{
if (strlen(clde->d_name) == 8 &&
@@ -1752,6 +1755,7 @@ restoreTwoPhaseData(void)
PrepareRedoAdd(buf, InvalidXLogRecPtr, InvalidXLogRecPtr);
}
}
+ LWLockRelease(TwoPhaseStateLock);
FreeDir(cldir);
}
@@ -1792,7 +1796,7 @@ PrescanPreparedTransactions(TransactionId **xids_p, int *nxids_p)
int allocsize = 0;
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1867,7 +1871,7 @@ StandbyRecoverPreparedTransactions(void)
{
int i;
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1893,7 +1897,8 @@ StandbyRecoverPreparedTransactions(void)
* Scan the shared memory entries of TwoPhaseState and reload the state for
* each prepared transaction (reacquire locks, etc).
*
- * This is run during database startup.
+ * This is run at the end of recovery, but before we allow backends to write
+ * WAL.
*
* At the end of recovery the way we take snapshots will change. We now need
* to mark all running transactions with their full SubTransSetParent() info
@@ -1907,9 +1912,7 @@ RecoverPreparedTransactions(void)
{
int i;
- /*
- * Don't need a lock in the recovery phase.
- */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
TransactionId xid;
@@ -1955,7 +1958,6 @@ RecoverPreparedTransactions(void)
* Recreate its GXACT and dummy PGPROC. But, check whether it was
* added in redo and already has a shmem entry for it.
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
MarkAsPreparingGuts(gxact, xid, gid,
hdr->prepared_at,
hdr->owner, hdr->database);
@@ -1963,13 +1965,13 @@ RecoverPreparedTransactions(void)
/* recovered, so reset the flag for entries generated by redo */
gxact->inredo = false;
+ GXactLoadSubxactData(gxact, hdr->nsubxacts, subxids);
+ MarkAsPrepared(gxact, true);
+
LWLockRelease(TwoPhaseStateLock);
- GXactLoadSubxactData(gxact, hdr->nsubxacts, subxids);
- MarkAsPrepared(gxact);
-
/*
- * Recover other state (notably locks) using resource managers
+ * Recover other state (notably locks) using resource managers.
*/
ProcessRecords(bufptr, xid, twophase_recover_callbacks);
@@ -1988,7 +1990,11 @@ RecoverPreparedTransactions(void)
PostPrepare_Twophase();
pfree(buf);
+
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
}
+
+ LWLockRelease(TwoPhaseStateLock);
}
/*
@@ -2014,6 +2020,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
TwoPhaseFileHeader *hdr;
int i;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
+
if (!fromdisk)
Assert(prepare_start_lsn != InvalidXLogRecPtr);
@@ -2030,8 +2038,8 @@ ProcessTwoPhaseBuffer(TransactionId xid,
else
{
ereport(WARNING,
- (errmsg("removing stale two-phase state from"
- " shared memory for \"%u\"", xid)));
+ (errmsg("removing stale two-phase state from shared memory for \"%u\"",
+ xid)));
PrepareRedoRemove(xid, true);
}
return NULL;
@@ -2308,6 +2316,7 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
const char *gid;
GlobalTransaction gxact;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
bufptr = buf + MAXALIGN(sizeof(TwoPhaseFileHeader));
@@ -2324,7 +2333,6 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
* that it got added in the redo phase
*/
- LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
/* Get a free gxact from the freelist */
if (TwoPhaseState->freeGXacts == NULL)
ereport(ERROR,
@@ -2350,17 +2358,17 @@ PrepareRedoAdd(char *buf, XLogRecPtr start_lsn, XLogRecPtr end_lsn)
Assert(TwoPhaseState->numPrepXacts < max_prepared_xacts);
TwoPhaseState->prepXacts[TwoPhaseState->numPrepXacts++] = gxact;
- LWLockRelease(TwoPhaseStateLock);
-
- elog(DEBUG2, "Adding 2PC data to shared memory %u", gxact->xid);
+ elog(DEBUG2, "added 2PC data in shared memory for transaction %u", gxact->xid);
}
/*
* PrepareRedoRemove
*
- * Remove the corresponding gxact entry from TwoPhaseState. Also
- * remove the 2PC file if a prepared transaction was saved via
- * an earlier checkpoint.
+ * Remove the corresponding gxact entry from TwoPhaseState. Also remove
+ * the 2PC file if a prepared transaction was saved via an earlier checkpoint.
+ *
+ * Caller must hold TwoPhaseStateLock in exclusive mode, because TwoPhaseState
+ * is updated.
*/
void
PrepareRedoRemove(TransactionId xid, bool giveWarning)
@@ -2369,9 +2377,9 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
int i;
bool found = false;
+ Assert(LWLockHeldByMeInMode(TwoPhaseStateLock, LW_EXCLUSIVE));
Assert(RecoveryInProgress());
- LWLockAcquire(TwoPhaseStateLock, LW_SHARED);
for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
{
gxact = TwoPhaseState->prepXacts[i];
@@ -2383,7 +2391,6 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
break;
}
}
- LWLockRelease(TwoPhaseStateLock);
/*
* Just leave if there is nothing, this is expected during WAL replay.
@@ -2394,7 +2401,7 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
/*
* And now we can clean up any files we may have left.
*/
- elog(DEBUG2, "Removing 2PC data from shared memory %u", xid);
+ elog(DEBUG2, "removing 2PC data for transaction %u", xid);
if (gxact->ondisk)
RemoveTwoPhaseFile(xid, giveWarning);
RemoveGXact(gxact);
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 7e8c598f2a..e09696c37f 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5351,6 +5351,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
int i;
TimestampTz commit_time;
+ Assert(TransactionIdIsValid(xid));
+
max_xid = TransactionIdLatest(xid, parsed->nsubxacts, parsed->subxacts);
/*
@@ -5518,6 +5520,8 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid)
int i;
TransactionId max_xid;
+ Assert(TransactionIdIsValid(xid));
+
/*
* Make sure nextXid is beyond any XID mentioned in the record.
*
@@ -5598,51 +5602,49 @@ xact_redo(XLogReaderState *record)
/* Backup blocks are not used in xact records */
Assert(!XLogRecHasAnyBlockRefs(record));
- if (info == XLOG_XACT_COMMIT || info == XLOG_XACT_COMMIT_PREPARED)
+ if (info == XLOG_XACT_COMMIT)
{
xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
xl_xact_parsed_commit parsed;
- ParseCommitRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
-
- if (info == XLOG_XACT_COMMIT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, XLogRecGetXid(record),
- record->EndRecPtr, XLogRecGetOrigin(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_commit(&parsed, parsed.twophase_xid,
- record->EndRecPtr, XLogRecGetOrigin(record));
-
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, XLogRecGetXid(record),
+ record->EndRecPtr, XLogRecGetOrigin(record));
}
- else if (info == XLOG_XACT_ABORT || info == XLOG_XACT_ABORT_PREPARED)
+ else if (info == XLOG_XACT_COMMIT_PREPARED)
+ {
+ xl_xact_commit *xlrec = (xl_xact_commit *) XLogRecGetData(record);
+ xl_xact_parsed_commit parsed;
+
+ ParseCommitRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_commit(&parsed, parsed.twophase_xid,
+ record->EndRecPtr, XLogRecGetOrigin(record));
+
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
+ }
+ else if (info == XLOG_XACT_ABORT)
{
xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
xl_xact_parsed_abort parsed;
- ParseAbortRecord(XLogRecGetInfo(record), xlrec,
- &parsed);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, XLogRecGetXid(record));
+ }
+ else if (info == XLOG_XACT_ABORT_PREPARED)
+ {
+ xl_xact_abort *xlrec = (xl_xact_abort *) XLogRecGetData(record);
+ xl_xact_parsed_abort parsed;
- if (info == XLOG_XACT_ABORT)
- {
- Assert(!TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, XLogRecGetXid(record));
- }
- else
- {
- Assert(TransactionIdIsValid(parsed.twophase_xid));
- xact_redo_abort(&parsed, parsed.twophase_xid);
+ ParseAbortRecord(XLogRecGetInfo(record), xlrec, &parsed);
+ xact_redo_abort(&parsed, parsed.twophase_xid);
- /* Delete TwoPhaseState gxact entry and/or 2PC file. */
- PrepareRedoRemove(parsed.twophase_xid, false);
- }
+ /* Delete TwoPhaseState gxact entry and/or 2PC file. */
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
+ PrepareRedoRemove(parsed.twophase_xid, false);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_PREPARE)
{
@@ -5650,9 +5652,11 @@ xact_redo(XLogReaderState *record)
* Store xid and start/end pointers of the WAL record in TwoPhaseState
* gxact entry.
*/
+ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
PrepareRedoAdd(XLogRecGetData(record),
record->ReadRecPtr,
record->EndRecPtr);
+ LWLockRelease(TwoPhaseStateLock);
}
else if (info == XLOG_XACT_ASSIGNMENT)
{
On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe. (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).
Logically both approaches are really close as with this approach the
additional locked area is when scanning the entries of TwoPhaseState.
But I agree that your suggestion would be safer in the long run. This
looks good to me after a close look.
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Michael Paquier wrote:
On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe. (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).Logically both approaches are really close as with this approach the
additional locked area is when scanning the entries of TwoPhaseState.
But I agree that your suggestion would be safer in the long run. This
looks good to me after a close look.
Thanks, pushed.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Thu, Jun 15, 2017 at 12:35 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Michael Paquier wrote:
On Wed, Jun 14, 2017 at 8:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe. (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).Logically both approaches are really close as with this approach the
additional locked area is when scanning the entries of TwoPhaseState.
But I agree that your suggestion would be safer in the long run. This
looks good to me after a close look.Thanks, pushed.
Thanks for the commit.
--
Michael
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs