Assertion failure in SnapBuildInitialSnapshot()

Started by Amit Kapilaover 3 years ago69 messageshackers
Jump to latest
#1Amit Kapila
amit.kapila16@gmail.com

Hi,

Thomas has reported this failure in an email [1]/messages/by-id/CA+hUKG+A_LyW=FAOi2ebA9Vr-1=esu+eBSm0dsVhU=Egfpipkg@mail.gmail.com and shared the
following links offlist with me:
https://cirrus-ci.com/task/5311549010083840
https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/testrun/build/testrun/subscription/100_bugs/log/100_bugs_twoways.log
https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/crashlog/crashlog-postgres.exe_1c40_2022-11-08_00-20-28-110.txt

The call stack is as follows:
00000063`4edff670 00007ff7`1922fcdf postgres!ExceptionalCondition(
char * conditionName = 0x00007ff7`198f8050
"TransactionIdPrecedesOrEquals(safeXid, snap->xmin)",
char * fileName = 0x00007ff7`198f8020
"../src/backend/replication/logical/snapbuild.c",
int lineNumber = 0n600)+0x78 [c:\cirrus\src\backend\utils\error\assert.c @ 67]
00000063`4edff6b0 00007ff7`192106df postgres!SnapBuildInitialSnapshot(
struct SnapBuild * builder = 0x00000251`5b95bce8)+0x20f
[c:\cirrus\src\backend\replication\logical\snapbuild.c @ 600]
00000063`4edff730 00007ff7`1920d9f6 postgres!CreateReplicationSlot(
struct CreateReplicationSlotCmd * cmd = 0x00000251`5b94d828)+0x40f
[c:\cirrus\src\backend\replication\walsender.c @ 1152]
00000063`4edff870 00007ff7`192bc9c4 postgres!exec_replication_command(
char * cmd_string = 0x00000251`5b94ac68 "CREATE_REPLICATION_SLOT
"pg_16400_sync_16392_7163433409941550636" LOGICAL pgoutput (SNAPSHOT
'use')")+0x4a6 [c:\cirrus\src\backend\replication\walsender.c @ 1804]

As per my investigation based on the above logs, the failed test is
due to the following command in 100_bugs.pl:
$node_twoways->safe_psql('d2',
"CREATE SUBSCRIPTION testsub CONNECTION \$\$"
. $node_twoways->connstr('d1')
. "\$\$ PUBLICATION testpub WITH (create_slot=false, "
. "slot_name='testslot')");

It failed while creating the table sync slot.

The failure happens because the xmin computed by the snap builder is
lesser than what is computed by GetOldestSafeDecodingTransactionId()
during initial snapshot creation for the tablesync slot by
SnapBuildInitialSnapshot.

To investigate, I tried to study how the values of "safeXid" and
"snap->xmin" are computed in SnapBuildInitialSnapshot(). There appear
to be four places in the code where we assign value to xmin
(builder-xmin) during the snapshot building process and then we assign
the same to snap->xmin. Those places are: (a) Two places in
SnapBuildFindSnapshot(), (b) One place in SnapBuildRestore(), and (c)
One place in SnapBuildProcessRunningXacts()

Seeing the LOGS, it appears to me that we find a consistent point from
the below code in SnapBuildFindSnapshot() and the following line
assigns builder->xmin.

...
if (running->oldestRunningXid == running->nextXid)
{
...
builder->xmin = running->nextXid;

The reason is we only see "logical decoding found consistent point at
..." in LOGs. If SnapBuildFindSnapshot() has to go through the entire
machinery of snapshot building then, we should have seen "logical
decoding found initial starting point at ..." and similar other LOGs.
The builder->xmin can't be assigned from any other place in (b) or
(c). From (c), it can't be assigned because we are building a full
snapshot here which won't restore any serialized snapshot. It can't be
assigned from (b) because while creating a slot we stop as soon as we
find the consistent point, see
DecodingContextFindStartpoint()->DecodingContextReady()

In the above code snippet, the running->nextXid in the above code
snippet has been assigned from ShmemVariableCache->nextXid in
GetRunningTransactionData().

The safeXid computed from GetOldestSafeDecodingTransactionId() uses
ShmemVariableCache->nextXid as the starting point and keeps the slot's
xmin as the safe Xid limit.

It seems to me due to SnapBuilder's initial_xmin_horizon, we won't set
(SnapBuilder's) xmin lesser than slot's effective_xmin computed in
CreateInitDecodingContext(). See SnapBuildFindSnapshot(). So, ideally,
SnapBuildInitialSnapshot() should never compute safeXid which is based
on the minimum of all slot's effective_xmin to be greater than
SnapBuilder's xmin (or snapshot's xmin). In short, the code as written
seems correct to me.

I have also tried to analyze if any recent commit (7f13ac8123) could
cause this problem but can't think of any reason because the changes
are related to the restart of decoding and the failed test is related
to creating a slot the very first time.

I don't have any good ideas on how to proceed with this. Any thoughts
on this would be helpful?

Note: I have briefly discussed this issue with Sawada-San and
Kuroda-San, so kept them in Cc.

[1]: /messages/by-id/CA+hUKG+A_LyW=FAOi2ebA9Vr-1=esu+eBSm0dsVhU=Egfpipkg@mail.gmail.com

--
With Regards,
Amit Kapila.

#2Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#1)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:

I don't have any good ideas on how to proceed with this. Any thoughts
on this would be helpful?

One thing worth doing might be to convert the assertion path into an elog(),
mentioning both xids (or add a framework for things like AssertLT(), but that
seems hard). With the concrete values we could make a better guess at what's
going wrong.

It'd probably not hurt to just perform this check independent of
USE_ASSERT_CHECKING - compared to the cost of creating a slot it's neglegible.

That'll obviously only help us whenever we re-encounter the issue, which will
likely be a while...

Have you tried reproducing the issue by running the test in a loop?

One thing I noticed just now is that we don't assert
builder->building_full_snapshot==true. I think we should? That didn't use to
be an option, but now it is... It doesn't look to me like that's the issue,
but ...

Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
SnapBuildExportSnapshot()? Why aren't the error checks for
SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
transactions are we even in when we import the snapshot (cf.
SnapBuildExportSnapshot() doing a StartTransactionCommand()).

I'm also somewhat suspicious of calling RestoreTransactionSnapshot() with
source=MyProc. Looks like it works, but it'd be pretty easy to screw up, and
there's no comments in SetTransactionSnapshot() or
ProcArrayInstallRestoredXmin() warning that that might be the case.

Greetings,

Andres Freund

#3Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#2)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-14 17:25:31 -0800, Andres Freund wrote:

Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
SnapBuildExportSnapshot()? Why aren't the error checks for
SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
transactions are we even in when we import the snapshot (cf.
SnapBuildExportSnapshot() doing a StartTransactionCommand()).

Most of the checks for that are in CreateReplicationSlot() - but not al,
e.g. XactReadOnly isn't set, nor do we enforce in an obvious place that we
don't already hold a snapshot.

I first thought this might directly explain the problem, due to the
MyProc->xmin assignment in SnapBuildInitialSnapshot() overwriting a value that
could influence the return value for GetOldestSafeDecodingTransactionId(). But
that happens later, and we check that MyProc->xmin is invalid at the start.

But it still seems supicious. This will e.g. influence whether
StartupDecodingContext() sets PROC_IN_LOGICAL_DECODING. Which probably is
fine, but...

Another theory: I dimly remember Thomas mentioning that there's some different
behaviour of xlogreader during shutdown as part of the v15 changes. I don't
quite remember what the scenario leading up to that was. Thomas?

It's certainly interesting that we see stuff like:

2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] STATEMENT: START_REPLICATION SLOT "pg_16400_sync_16395_7163433409941550636" LOGICAL 0/1D2B650 (proto_version '3', origin 'any', publication_names '"testpub"')
ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
2022-11-08 00:20:23.255 GMT [248][logical replication worker] ERROR: error while shutting down streaming COPY: ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710

It could entirely be caused by postmaster slowly killing processes after the
assertion failure and that that is corrupting shared memory state though. But
it might also be related.

Greetings,

Andres Freund

#4Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#3)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Tue, Nov 15, 2022 at 3:38 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-14 17:25:31 -0800, Andres Freund wrote:
Another theory: I dimly remember Thomas mentioning that there's some different
behaviour of xlogreader during shutdown as part of the v15 changes. I don't
quite remember what the scenario leading up to that was. Thomas?

Yeah. So as mentioned in:

/messages/by-id/CA+hUKG+WKsZpdoryeqM8_rk5uQPCqS2HGY92WiMGFsK2wVkcig@mail.gmail.com

I still have on my list to remove a new "missing contrecord" error
that can show up in a couple of different scenarios that aren't
exactly error conditions depending on how you think about it, but I
haven't done that yet. I am not currently aware of anything bad
happening because of those messages, but I could be wrong.

It's certainly interesting that we see stuff like:

2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
2022-11-08 00:20:23.255 GMT [2012][walsender] [pg_16400_sync_16395_7163433409941550636][8/0:0] STATEMENT: START_REPLICATION SLOT "pg_16400_sync_16395_7163433409941550636" LOGICAL 0/1D2B650 (proto_version '3', origin 'any', publication_names '"testpub"')
ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710
2022-11-08 00:20:23.255 GMT [248][logical replication worker] ERROR: error while shutting down streaming COPY: ERROR: could not find record while sending logically-decoded data: missing contrecord at 0/1D3B710

Right, so that might fit the case described in my email above:
logical_read_xlog_page() notices that it has been asked to shut down
when it is between reads of pages with a spanning contrecord. Before,
it would fail silently, so XLogReadRecord() returns NULL without
setting *errmsg, but now it complains about a missing contrecord. In
the case where it was showing up on that other thread, just a few
machines often seemed to log that error when shutting down --
peripatus for example -- I don't know why, but I assume something to
do with shutdown timing and page alignment.

It could entirely be caused by postmaster slowly killing processes after the
assertion failure and that that is corrupting shared memory state though. But
it might also be related.

Hmm.

#5osumi.takamichi@fujitsu.com
osumi.takamichi@fujitsu.com
In reply to: Andres Freund (#2)
RE: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On Tuesday, November 15, 2022 10:26 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:

I don't have any good ideas on how to proceed with this. Any thoughts
on this would be helpful?

One thing worth doing might be to convert the assertion path into an elog(),
mentioning both xids (or add a framework for things like AssertLT(), but that
seems hard). With the concrete values we could make a better guess at
what's going wrong.

It'd probably not hurt to just perform this check independent of
USE_ASSERT_CHECKING - compared to the cost of creating a slot it's
neglegible.

That'll obviously only help us whenever we re-encounter the issue, which will
likely be a while...

Have you tried reproducing the issue by running the test in a loop?

Just FYI, I've tried to reproduce this failure in a loop,
but I couldn't hit the same assertion failure.

I extracted the #16643 of 100_bugs.pl only and
executed the tests more than 500 times.

My env and test was done in rhel7.9 and gcc 4.8 with configure option of
--enable-cassert --enable-debug --enable-tap-tests --with-icu CFLAGS=-O0 and prefix.

Best Regards,
Takamichi Osumi

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#3)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-14 17:25:31 -0800, Andres Freund wrote:

Hm, also, shouldn't the patch adding CRS_USE_SNAPSHOT have copied more of
SnapBuildExportSnapshot()? Why aren't the error checks for
SnapBuildExportSnapshot() needed? Why don't we need to set XactReadOnly? Which
transactions are we even in when we import the snapshot (cf.
SnapBuildExportSnapshot() doing a StartTransactionCommand()).

Most of the checks for that are in CreateReplicationSlot() - but not al,
e.g. XactReadOnly isn't set,

Yeah, I think we can add the check for XactReadOnly along with other
checks in CreateReplicationSlot().

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

--
With Regards,
Amit Kapila.

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#2)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Tue, Nov 15, 2022 at 6:55 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-10 16:04:40 +0530, Amit Kapila wrote:

I don't have any good ideas on how to proceed with this. Any thoughts
on this would be helpful?

One thing worth doing might be to convert the assertion path into an elog(),
mentioning both xids (or add a framework for things like AssertLT(), but that
seems hard). With the concrete values we could make a better guess at what's
going wrong.

It'd probably not hurt to just perform this check independent of
USE_ASSERT_CHECKING - compared to the cost of creating a slot it's neglegible.

That'll obviously only help us whenever we re-encounter the issue, which will
likely be a while...

Agreed.

One thing I noticed just now is that we don't assert
builder->building_full_snapshot==true. I think we should? That didn't use to
be an option, but now it is... It doesn't look to me like that's the issue,
but ...

Agreed.

The attached patch contains both changes. It seems to me this issue
can happen, if somehow, either slot's effective_xmin increased after
we assign its initial value in CreateInitDecodingContext() or somehow
its value is InvalidTransactionId when we have invoked
SnapBuildInitialSnapshot(). The other possibility is that the
initial_xmin_horizon check in SnapBuildFindSnapshot() doesn't insulate
us from assigning builder->xmin value older than initial_xmin_horizon.
I am not able to see if any of this can be true.

--
With Regards,
Amit Kapila.

Attachments:

improve_checks_snapbuild_1.patchapplication/octet-stream; name=improve_checks_snapbuild_1.patchDownload+8-10
#8Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#7)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.

On 2022-11-15 17:21:44 +0530, Amit Kapila wrote:

One thing I noticed just now is that we don't assert
builder->building_full_snapshot==true. I think we should? That didn't use to
be an option, but now it is... It doesn't look to me like that's the issue,
but ...

Agreed.

The attached patch contains both changes. It seems to me this issue
can happen, if somehow, either slot's effective_xmin increased after
we assign its initial value in CreateInitDecodingContext() or somehow
its value is InvalidTransactionId when we have invoked
SnapBuildInitialSnapshot(). The other possibility is that the
initial_xmin_horizon check in SnapBuildFindSnapshot() doesn't insulate
us from assigning builder->xmin value older than initial_xmin_horizon.
I am not able to see if any of this can be true.

Yea, I don't immediately see anything either. Given the discussion in
/messages/by-id/Yz2hivgyjS1RfMKs@depesz.com I am
starting to wonder if we've introduced a race in the slot machinery.

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 5006a5c464..e85c75e0e6 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -566,11 +566,13 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
{
Snapshot	snap;
TransactionId xid;
+	TransactionId safeXid;
TransactionId *newxip;
int			newxcnt = 0;

Assert(!FirstSnapshotSet);
Assert(XactIsoLevel == XACT_REPEATABLE_READ);
+ Assert(builder->building_full_snapshot);

if (builder->state != SNAPBUILD_CONSISTENT)
elog(ERROR, "cannot build an initial slot snapshot before reaching a consistent state");
@@ -589,17 +591,13 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
* mechanism. Due to that we can do this without locks, we're only
* changing our own value.
*/

Perhaps add something like "Creating a snapshot is expensive and an unenforced
xmin horizon would have bad consequences, therefore always double-check that
the horizon is enforced"?

-#ifdef USE_ASSERT_CHECKING
-	{
-		TransactionId safeXid;
-
-		LWLockAcquire(ProcArrayLock, LW_SHARED);
-		safeXid = GetOldestSafeDecodingTransactionId(false);
-		LWLockRelease(ProcArrayLock);
+	LWLockAcquire(ProcArrayLock, LW_SHARED);
+	safeXid = GetOldestSafeDecodingTransactionId(false);
+	LWLockRelease(ProcArrayLock);
-		Assert(TransactionIdPrecedesOrEquals(safeXid, snap->xmin));
-	}
-#endif
+	if (TransactionIdFollows(safeXid, snap->xmin))
+		elog(ERROR, "cannot build an initial slot snapshot when oldest safe xid %u follows snapshot's xmin %u",
+			 safeXid, snap->xmin);

MyProc->xmin = snap->xmin;

s/when/as/

Greetings,

Andres Freund

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#8)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.

So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
we have the Assert for Assert(!FirstSnapshotSet)?
Assert(FirstXactSnapshot == NULL);
Assert(!HistoricSnapshotActive());

--
With Regards,
Amit Kapila.

#10Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#9)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.

So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
we have the Assert for Assert(!FirstSnapshotSet)?
Assert(FirstXactSnapshot == NULL);
Assert(!HistoricSnapshotActive());

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
if (HaveRegisteredOrActiveSnapshot())
elog(ERROR, "cannot build an initial slot snapshot when snapshots exist")
Assert(!HistoricSnapshotActive());

I think we'd not need to assert FirstXactSnapshot == NULL or !FirstSnapshotSet
with that, because those would show up in HaveRegisteredOrActiveSnapshot().

Greetings,

Andres Freund

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#10)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.

So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
we have the Assert for Assert(!FirstSnapshotSet)?
Assert(FirstXactSnapshot == NULL);
Assert(!HistoricSnapshotActive());

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */

The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
We already have a check (/* so we don't overwrite the existing value
*/
if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
which ensures that we don't overwrite MyProc->xmin, so the above
comment seems contradictory to me.

--
With Regards,
Amit Kapila.

#12Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#10)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

nor do we enforce in an obvious place that we
don't already hold a snapshot.

We have a check for (FirstXactSnapshot == NULL) in
RestoreTransactionSnapshot->SetTransactionSnapshot. Won't that be
sufficient?

I don't think that'd e.g. catch a catalog snapshot being held, yet that'd
still be bad. And I think checking in SetTransactionSnapshot() is too late,
we've already overwritten MyProc->xmin by that point.

So, shall we add the below Asserts in SnapBuildInitialSnapshot() after
we have the Assert for Assert(!FirstSnapshotSet)?
Assert(FirstXactSnapshot == NULL);
Assert(!HistoricSnapshotActive());

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */
if (HaveRegisteredOrActiveSnapshot())
elog(ERROR, "cannot build an initial slot snapshot when snapshots exist")
Assert(!HistoricSnapshotActive());

I think we'd not need to assert FirstXactSnapshot == NULL or !FirstSnapshotSet
with that, because those would show up in HaveRegisteredOrActiveSnapshot().

In the attached patch, I have incorporated this change and other
feedback. I think this should probably help us find the reason for
this problem when we see it in the future.

--
With Regards,
Amit Kapila.

Attachments:

v2-0001-Add-additional-checks-while-creating-the-initial-.patchapplication/octet-stream; name=v2-0001-Add-additional-checks-while-creating-the-initial-.patchDownload+23-12
#13Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#11)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-17 10:44:18 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */

The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
We already have a check (/* so we don't overwrite the existing value
*/
if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
which ensures that we don't overwrite MyProc->xmin, so the above
comment seems contradictory to me.

The point is that catalog snapshots could easily end up setting MyProc->xmin,
even though the caller hasn't done anything wrong. So the
InvalidateCatalogSnapshot() would avoid erroring out in a number of scenarios.

Greetings,

Andres Freund

#14Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#13)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Thu, Nov 17, 2022 at 11:15 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-17 10:44:18 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 11:56 PM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-16 14:22:01 +0530, Amit Kapila wrote:

On Wed, Nov 16, 2022 at 7:30 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-15 16:20:00 +0530, Amit Kapila wrote:

On Tue, Nov 15, 2022 at 8:08 AM Andres Freund <andres@anarazel.de> wrote:

I don't think that'd catch a catalog snapshot. But perhaps the better answer
for the catalog snapshot is to just invalidate it explicitly. The user doesn't
have control over the catalog snapshot being taken, and it's not too hard to
imagine the walsender code triggering one somewhere.

So maybe we should add something like:

InvalidateCatalogSnapshot(); /* about to overwrite MyProc->xmin */

The comment "/* about to overwrite MyProc->xmin */" is unclear to me.
We already have a check (/* so we don't overwrite the existing value
*/
if (TransactionIdIsValid(MyProc->xmin))) in SnapBuildInitialSnapshot()
which ensures that we don't overwrite MyProc->xmin, so the above
comment seems contradictory to me.

The point is that catalog snapshots could easily end up setting MyProc->xmin,
even though the caller hasn't done anything wrong. So the
InvalidateCatalogSnapshot() would avoid erroring out in a number of scenarios.

Okay, updated the patch accordingly.

--
With Regards,
Amit Kapila.

Attachments:

v3-0001-Add-additional-checks-while-creating-the-initial-.patchapplication/octet-stream; name=v3-0001-Add-additional-checks-while-creating-the-initial-.patchDownload+23-12
#15Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#14)
Re: Assertion failure in SnapBuildInitialSnapshot()

Hi,

On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:

Okay, updated the patch accordingly.

Assuming it passes tests etc, this'd work for me.

- Andres

#16Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#15)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Sat, Nov 19, 2022 at 6:35 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:

Okay, updated the patch accordingly.

Assuming it passes tests etc, this'd work for me.

Thanks, Pushed.

--
With Regards,
Amit Kapila.

#17Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#16)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Sat, Nov 19, 2022 at 6:35 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:

Okay, updated the patch accordingly.

Assuming it passes tests etc, this'd work for me.

Thanks, Pushed.

The same assertion failure has been reported on another thread[1]/messages/by-id/tencent_7EB71DA5D7BA00EB0B429DCE45D0452B6406@qq.com.
Since I could reproduce this issue several times in my environment
I've investigated the root cause.

I think there is a race condition of updating
procArray->replication_slot_xmin by CreateInitDecodingContext() and
LogicalConfirmReceivedLocation().

What I observed in the test was that a walsender process called:
SnapBuildProcessRunningXacts()
LogicalIncreaseXminForSlot()
LogicalConfirmReceivedLocation()
ReplicationSlotsComputeRequiredXmin(false).

In ReplicationSlotsComputeRequiredXmin() it acquired the
ReplicationSlotControlLock and got 0 as the minimum xmin since there
was no wal sender having effective_xmin. Before calling
ProcArraySetReplicationSlotXmin() (i.e. before acquiring
ProcArrayLock), another walsender process called
CreateInitDecodingContext(), acquired ProcArrayLock, computed
slot->effective_catalog_xmin, called
ReplicationSlotsComputeRequiredXmin(true). Since its
effective_catalog_xmin had been set, it got 39968 as the minimum xmin,
and updated replication_slot_xmin. However, as soon as the second
walsender released ProcArrayLock, the first walsender updated the
replication_slot_xmin to 0. After that, the second walsender called
SnapBuildInitialSnapshot(), and GetOldestSafeDecodingTransactionId()
returned an XID newer than snap->xmin.

One idea to fix this issue is that in
ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
while holding both ProcArrayLock and ReplicationSlotControlLock, and
release only ReplicationSlotsControlLock before updating the
replication_slot_xmin. I'm concerned it will increase the contention
on ProcArrayLock but I've attached the patch for discussion.

Regards,

[1]: /messages/by-id/tencent_7EB71DA5D7BA00EB0B429DCE45D0452B6406@qq.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

fix_slot_xmin_race_condition.patchapplication/octet-stream; name=fix_slot_xmin_race_condition.patchDownload+10-11
#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#17)
Re: Assertion failure in SnapBuildInitialSnapshot()

On Thu, Dec 8, 2022 at 8:17 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

The same assertion failure has been reported on another thread[1].
Since I could reproduce this issue several times in my environment
I've investigated the root cause.

I think there is a race condition of updating
procArray->replication_slot_xmin by CreateInitDecodingContext() and
LogicalConfirmReceivedLocation().

What I observed in the test was that a walsender process called:
SnapBuildProcessRunningXacts()
LogicalIncreaseXminForSlot()
LogicalConfirmReceivedLocation()
ReplicationSlotsComputeRequiredXmin(false).

In ReplicationSlotsComputeRequiredXmin() it acquired the
ReplicationSlotControlLock and got 0 as the minimum xmin since there
was no wal sender having effective_xmin.

What about the current walsender process which is processing
running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
slot's effective_xmin have a non-zero value? If not, then why?

--
With Regards,
Amit Kapila.

#19Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Masahiko Sawada (#17)
RE: Assertion failure in SnapBuildInitialSnapshot()

Dear Sawada-san,

Thank you for making the patch! I'm still considering whether this approach is
correct, but I can put a comment to your patch anyway.

```
-	Assert(!already_locked || LWLockHeldByMe(ProcArrayLock));
-
-	if (!already_locked)
-		LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+	Assert(LWLockHeldByMe(ProcArrayLock));
```

In this function, we regard that the ProcArrayLock has been already acquired as
exclusive mode and modify data. I think LWLockHeldByMeInMode() should be used
instead of LWLockHeldByMe().
I confirmed that there is only one caller that uses ReplicationSlotsComputeRequiredXmin(true)
and it acquires exclusive lock correctly, but it can avoid future bug.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#20Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Amit Kapila (#18)
RE: Assertion failure in SnapBuildInitialSnapshot()

Dear Amit, Sawada-san,

I have also reproduced the failure on PG15 with some debug log, and I agreed that
somebody changed procArray->replication_slot_xmin to InvalidTransactionId.

The same assertion failure has been reported on another thread[1].
Since I could reproduce this issue several times in my environment
I've investigated the root cause.

I think there is a race condition of updating
procArray->replication_slot_xmin by CreateInitDecodingContext() and
LogicalConfirmReceivedLocation().

What I observed in the test was that a walsender process called:
SnapBuildProcessRunningXacts()
LogicalIncreaseXminForSlot()
LogicalConfirmReceivedLocation()
ReplicationSlotsComputeRequiredXmin(false).

In ReplicationSlotsComputeRequiredXmin() it acquired the
ReplicationSlotControlLock and got 0 as the minimum xmin since there
was no wal sender having effective_xmin.

What about the current walsender process which is processing
running_xacts via SnapBuildProcessRunningXacts()? Isn't that walsender
slot's effective_xmin have a non-zero value? If not, then why?

Normal walsenders which are not for tablesync create a replication slot with
NOEXPORT_SNAPSHOT option. I think in this case, CreateInitDecodingContext() is
called with need_full_snapshot = false, and slot->effective_xmin is not updated.
It is set as InvalidTransactionId at ReplicationSlotCreate() and no functions update
that. Hence the slot acquired by the walsender may have Invalid effective_min.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

#21Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#20)
#22Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#21)
#23Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#22)
#24Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#17)
#25Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#23)
#26Amit Kapila
amit.kapila16@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#19)
#27Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#26)
#28Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#24)
#29Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#27)
#30Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#29)
#31Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#30)
#32Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#31)
#33Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#32)
#34Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#32)
#35Andres Freund
andres@anarazel.de
In reply to: Amit Kapila (#33)
#36Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#35)
#37Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#35)
#38Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#36)
#39Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#37)
#40Daniel Gustafsson
daniel@yesql.se
In reply to: Masahiko Sawada (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#35)
#42vignesh C
vignesh21@gmail.com
In reply to: Masahiko Sawada (#39)
#43vignesh C
vignesh21@gmail.com
In reply to: vignesh C (#42)
#44Alexander Lakhin
exclusion@gmail.com
In reply to: vignesh C (#43)
#45Pradeep Kumar
spradeepkumar29@gmail.com
In reply to: Alexander Lakhin (#44)
#46Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Pradeep Kumar (#45)
#47Pradeep Kumar
spradeepkumar29@gmail.com
In reply to: Masahiko Sawada (#46)
#48Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#46)
#49Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#48)
#50Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#49)
#51Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#50)
#52Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#51)
#53Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#52)
#54Pradeep Kumar
spradeepkumar29@gmail.com
In reply to: Masahiko Sawada (#53)
#55Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Pradeep Kumar (#54)
#56Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Zhijie Hou (Fujitsu) (#55)
#57Amit Kapila
amit.kapila16@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#56)
#58Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#57)
#59Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#58)
#60Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#59)
#61Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#60)
#62Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#61)
#63Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#62)
#64Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#63)
#65Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#64)
#66Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Masahiko Sawada (#65)
#67Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Zhijie Hou (Fujitsu) (#66)
#68Chao Li
li.evan.chao@gmail.com
In reply to: Masahiko Sawada (#67)
#69Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Chao Li (#68)