Fix 035_standby_logical_decoding.pl race conditions
Hi hackers,
Please find attached a patch to $SUBJECT.
In rare circumstances (and on slow machines) it is possible that an xl_running_xacts
record is emitted and that the catalog_xmin of a logical slot on the standby advances
past the conflict point. In that case, no conflict is reported and the test
fails. It has been observed several times and the last discussion can be found
in [1].
To avoid this race condition, this commit adds an injection point to prevent
the catalog_xmin of a logical slot from advancing past the conflict point.
While working on this patch, some adjustments to injection points have been
needed (they are proposed in 0001):
- Add the ability to wakeup() and detach() while ensuring that no process can
wait in between. This is done thanks to a new injection_points_wakeup_detach()
function that holds the spinlock for the whole duration.
- If the walsender is waiting on the injection point and the logical slot
is conflicting, then the walsender process is killed and so it is not able to
"empty" its injection slot. So the next injection_wait() should reuse this slot
(instead of using an empty one). injection_wait() has been modified that way
in 0001.
With 0001 in place, we can make use of an injection point in
LogicalConfirmReceivedLocation() and update 035_standby_logical_decoding.pl to
prevent the catalog_xmin of a logical slot from advancing past the conflict point.
Remarks:
R1. The issue still remains in v16 though (as injection points are only
available since v17).
R2. 0001 should probably bump the injection point module version to 1.1, but
shouldn't that have been done in d28cd3e7b21c?
[1]: /messages/by-id/386386.1737736935@sss.pgh.pa.us
Looking forward to your feedback,
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Mon, Feb 10, 2025 at 8:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find attached a patch to $SUBJECT.
In rare circumstances (and on slow machines) it is possible that a xl_running_xacts
is emitted and that the catalog_xmin of a logical slot on the standby advances
past the conflict point. In that case, no conflict is reported and the test
fails. It has been observed several times and the last discussion can be found
in [1].
Is my understanding correct that bgwriter on primary node has created
a xl_running_xacts, then that record is replicated to standby, and
while decoding it (xl_running_xacts) on standby via active_slot, we
advanced the catalog_xmin of active_slot? If this happens then the
replay of vacuum record on standby won't be able to invalidate the
active slot, right?
So, if the above is correct, the reason for generating extra
xl_running_xacts on primary is Vacuum followed by Insert on primary
via below part of test:
$node_primary->safe_psql(
'testdb', qq[VACUUM $vac_option verbose $to_vac;
INSERT INTO flush_wal DEFAULT VALUES;]);
To avoid the race condition to occur this commit adds an injection point to prevent
the catalog_xmin of a logical slot to advance past the conflict point.
While working on this patch, some adjustements have been needed for injection
points (they are proposed in 0001):
- Adds the ability to wakeup() and detach() while ensuring that no process can
wait in between. It's done thanks to a new injection_points_wakeup_detach()
function that is holding the spinlock during the whole duration.
- If the walsender is waiting on the injection point and that the logical slot
is conflicting, then the walsender process is killed and so it is not able to
"empty" it's injection slot. So the next injection_wait() should reuse this slot
(instead of using an empty one). injection_wait() has been modified that way
in 0001.
With 0001 in place, then we can make use of an injection point in
LogicalConfirmReceivedLocation() and update 035_standby_logical_decoding.pl to
prevent the catalog_xmin of a logical slot to advance past the conflict point.
Remarks:
R1. The issue still remains in v16 though (as injection points are available since
v17).
This is not an ideal situation because the test would still keep failing
intermittently on 16. I am wondering what if we start a transaction
before vacuum and do some DML in it but didn't commit that xact till
the active_slot test is finished then even the extra logging of
xl_running_xacts shouldn't advance xmin during decoding. This is
because reorder buffer may point to the xmin before vacuum. See
following code:
SnapBuildProcessRunningXacts()
....
xmin = ReorderBufferGetOldestXmin(builder->reorder);
if (xmin == InvalidTransactionId)
xmin = running->oldestRunningXid;
elog(DEBUG3, "xmin: %u, xmax: %u, oldest running: %u, oldest xmin: %u",
builder->xmin, builder->xmax, running->oldestRunningXid, xmin);
LogicalIncreaseXminForSlot(lsn, xmin);
...
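To make the fallback explicit, here is a minimal Python rendering of the quoted
xmin selection (the constant and function names are mine, not PostgreSQL's):

```python
INVALID_TRANSACTION_ID = 0

def candidate_xmin(reorder_oldest_xmin, oldest_running_xid):
    """Mirror the quoted snippet: the reorder buffer's oldest xmin wins;
    only when it is invalid does oldestRunningXid from the
    xl_running_xacts record take over."""
    xmin = reorder_oldest_xmin
    if xmin == INVALID_TRANSACTION_ID:
        xmin = oldest_running_xid
    return xmin

# An uncommitted transaction tracked by the reorder buffer pins the
# candidate xmin, so a later xl_running_xacts cannot advance it:
assert candidate_xmin(700, 766) == 700
# With nothing tracked, the record's oldestRunningXid is used:
assert candidate_xmin(INVALID_TRANSACTION_ID, 766) == 766
```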
Note that I have not tested this case, so I could be wrong. But if
possible, we should try to find some solution which could be
backpatched to 16 as well.
--
With Regards,
Amit Kapila.
Hi,
On Wed, Mar 19, 2025 at 12:12:19PM +0530, Amit Kapila wrote:
On Mon, Feb 10, 2025 at 8:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find attached a patch to $SUBJECT.
In rare circumstances (and on slow machines) it is possible that a xl_running_xacts
is emitted and that the catalog_xmin of a logical slot on the standby advances
past the conflict point. In that case, no conflict is reported and the test
fails. It has been observed several times and the last discussion can be found
in [1].
Thanks for looking at it!
Is my understanding correct that bgwriter on primary node has created
a xl_running_xacts, then that record is replicated to standby, and
while decoding it (xl_running_xacts) on standby via active_slot, we
advanced the catalog_xmin of active_slot? If this happens then the
replay of vacuum record on standby won't be able to invalidate the
active slot, right?
Yes, that's also my understanding. It's also easy to "simulate" by adding
a checkpoint on the primary and a long enough sleep after we launched our sql in
wait_until_vacuum_can_remove().
So, if the above is correct, the reason for generating extra
xl_running_xacts on primary is Vacuum followed by Insert on primary
via below part of test:
$node_primary->safe_psql(
'testdb', qq[VACUUM $vac_option verbose $to_vac;
INSERT INTO flush_wal DEFAULT VALUES;]);
I'm not sure, I think a xl_running_xacts could also be generated (for example by
the checkpointer) before the vacuum (should the system be slow enough).
Remarks:
R1. The issue still remains in v16 though (as injection points are available since
v17).
This is not idle case because the test would still keep failing
intermittently on 16.
I do agree.
I am wondering what if we start a transaction
before vacuum and do some DML in it but didn't commit that xact till
the active_slot test is finished then even the extra logging of
xl_running_xacts shouldn't advance xmin during decoding.
I'm not sure, as I think a xl_running_xacts could still be generated after
we execute "our sql" meaning:
"
$node_primary->safe_psql('testdb', qq[$sql]);
"
and before we launch the new DML. In that case I guess the issue could still
happen.
OTOH If we create the new DML "before" we launch "our sql" then the test
would also fail for both active and inactive slots because that would not
invalidate any slots.
I did observe the above with the attached changes (just changing the PREPARE
TRANSACTION location).
we should try to find some solution which could be
backpatched to 16 as well.
I agree, but I'm not sure it's doable as it looks to me that we should prevent
the catalog xmin from advancing past the conflict point while still
generating a conflict. Will try to give it another thought.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
test_prepared_txn.txt (text/plain)
Dear Bertrand,
I'm also working on the thread to resolve the random failure.
Yes, that's also my understanding. It's also easy to "simulate" by adding
a checkpoint on the primary and a long enough sleep after we launched our sql in
wait_until_vacuum_can_remove().
Thanks for letting me know. For me, it could be reproduced with only the sleep().
So, if the above is correct, the reason for generating extra
xl_running_xacts on primary is Vacuum followed by Insert on primary
via below part of test:
$node_primary->safe_psql(
'testdb', qq[VACUUM $vac_option verbose $to_vac;
INSERT INTO flush_wal DEFAULT VALUES;]);
I'm not sure, I think a xl_running_xacts could also be generated (for example by
the checkpointer) before the vacuum (should the system be slow enough).
I think you are right. When I added `CHECKPOINT` and a sleep after the user SQLs,
I got the ordering below. This means that the RUNNING_XACTS record was generated
before the prune triggered by the vacuum.
```
...
lsn: 0/04025218, prev 0/040251A0, desc: RUNNING_XACTS nextXid 766 latestCompletedXid 765 oldestRunningXid 766
...
lsn: 0/04028FD0, prev 0/04026FB0, desc: PRUNE_ON_ACCESS snapshotConflictHorizon: 765,...
...
```
I'm not sure, as I think a xl_running_xacts could still be generated after
we execute "our sql" meaning:
"
$node_primary->safe_psql('testdb', qq[$sql]);
"
and before we launch the new DML. In that case I guess the issue could still
happen.
OTOH If we create the new DML "before" we launch "our sql" then the test
would also fail for both active and inactive slots because that would not
invalidate any slots.
I did observe the above with the attached changes (just changing the PREPARE
TRANSACTION location).
I've also tried the idea with the living transaction via background_psql(),
but I got the same result. The test could fail when RUNNING_XACTS record was
generated before the transaction starts.
I agree, but I'm not sure it's doable as it looks to me that we should prevent
the catalog xmin to advance to advance past the conflict point while still
generating a conflict point. Will try to give it another thought.
One primitive idea for me was to stop the walsender/pg_recvlogical process for a while.
A SIGSTOP signal sent to pg_recvlogical may do the trick, but ISTM it would not
work on Windows. See 019_replslot_limit.pl.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Kuroda-san,
On Fri, Mar 21, 2025 at 12:28:10PM +0000, Hayato Kuroda (Fujitsu) wrote:
I'm also working on the thread to resolve the random failure.
Thanks for looking at it!
I've also tried the idea with the living transaction via background_psql(),
but I got the same result. The test could fail when RUNNING_XACTS record was
generated before the transaction starts.
and thanks for testing and confirming too.
SIGSTOP signal for pg_recvlogical may do the idea,
Yeah, but would we be "really" testing an "active" slot?
At the end we want to produce an invalidation that may or not happen on a real
environment. The corner case is in the test, not an issue of the feature to
fix.
So, I'm not sure I like the idea that much, but thinking out loud: I wonder if
we could bypass the "active" slot checks in 16 and 17 and use injection points as
proposed as of 18 (as we need the injection points changes proposed in 0001
up-thread). Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Dear Bertrand,
SIGSTOP signal for pg_recvlogical may do the idea,
Yeah, but would we be "really" testing an "active" slot?
Yeah, this is also a debatable point.
At the end we want to produce an invalidation that may or not happen on a real
environment. The corner case is in the test, not an issue of the feature to
fix.
I also think this is the test-issue, not the codebase.
So, I'm not sure I like the idea that much, but thinking out loud: I wonder if
we could bypass the "active" slot checks in 16 and 17 and use injection points as
proposed as of 18 (as we need the injection points changes proposed in 0001
up-thread). Thoughts?
I do not have any other idea either. I checked that your patch set could solve the issue.
Comments for the patch:
I'm not sure whether a new API is really needed. Isn't it enough to use both
injection_points_wakeup() and injection_points_detach()? This approach does not
require bumping the version, and can be backported to PG17.
Also, another check of whether the extension can be installed on the node is required.
Please see 041_checkpoint_at_promote.pl.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Kuroda-san,
On Mon, Mar 24, 2025 at 04:54:21AM +0000, Hayato Kuroda (Fujitsu) wrote:
So, I'm not sure I like the idea that much, but thinking out loud: I wonder if
we could bypass the "active" slot checks in 16 and 17 and use injection points as
proposed as of 18 (as we need the injection points changes proposed in 0001
up-thread). Thoughts?
I do not have any other idea either. I checked that your patch set could solve the issue.
Thanks for looking at it!
Comments for the patch:
I'm not sure whether new API is really needed. Isn't it enough to use both
injection_points_wakeup() and injection_points_detach()?
I think that the proposed changes are needed as they fix 2 issues that I hit
while working on 0002:
1. ensure that no process can wait in between wakeup() and detach().
2. If the walsender is waiting on the injection point and the logical slot
is conflicting, then the walsender process is killed and so it is not able to
"empty" its injection slot. So the next injection_wait() should reuse this slot
(instead of using an empty one).
Also, another check whether the extension can be installed for the node is required.
Please see 041_checkpoint_at_promote.pl.
Indeed I can see the "# Check if the extension injection_points is available, as
it may be possible that this script is run with installcheck, where the module
would not be installed by default", in 041_checkpoint_at_promote.pl.
Thanks! I think that makes sense and will add it in the proposed patch (early
next week).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Mar 21, 2025 at 9:48 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
So, I'm not sure I like the idea that much, but thinking out loud: I wonder if
we could bypass the "active" slot checks in 16 and 17 and use injection points as
proposed as of 18 (as we need the injection points changes proposed in 0001
up-thread). Thoughts?
The key point is that snapshotConflictHorizon should always be greater
than or equal to oldestRunningXid for this test to pass. The challenge
is that vacuum LOGs the safest xid to be removed as
snapshotConflictHorizon, which I think will always be at least one
less than oldestRunningXid. So, we can't make it pass unless we
ensure that no running_xact record gets logged after the last
successful transaction (in this case the SQL passed to the function
wait_until_vacuum_can_remove) and until the vacuum is replayed on
the standby. I see that even check_for_invalidation('pruning_', $logstart,
'with on-access pruning'); failed [1].
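A toy model of that inequality (my simplification, not the exact backend
check) shows why one stray xl_running_xacts defeats the test:

```python
def conflicts(slot_catalog_xmin, snapshot_conflict_horizon):
    """Simplified invalidation rule: replaying the pruning record
    conflicts with a slot only while the slot may still need one of
    the removed tuples."""
    return slot_catalog_xmin <= snapshot_conflict_horizon

oldest_running_xid = 766
horizon = oldest_running_xid - 1   # vacuum never logs more than this

# Slot xmin still behind the horizon: conflict reported, test passes.
assert conflicts(765, horizon)
# A stray xl_running_xacts advanced the slot's catalog_xmin to
# oldestRunningXid first: no conflict, and the test fails.
assert not conflicts(766, horizon)
```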
Seeing all these failures, I wonder whether we can reliably test
active slots apart from wal_level change test (aka Scenario 6:
incorrect wal_level on primary.). Sure, we can try by having some
injection point kind of tests, but is it really worth it because, anyway,
the active slots won't get invalidated in the scenarios for row
removal we are testing in this case. The other possibility is to add a
developer-level debug_disable_running_xact GUC to test this and
similar cases, or can't we have an injection point to control logging
this WAL record? I have seen the need to control logging running_xact
record in other cases as well.
[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2025-03-19%2007%3A08%3A16
--
With Regards,
Amit Kapila.
Dear Amit, Bertrand,
Seeing all these failures, I wonder whether we can reliably test
active slots apart from wal_level change test (aka Scenario 6:
incorrect wal_level on primary.). Sure, we can try by having some
injection point kind of tests, but is it really worth because, anyway
the active slots won't get invalidated in the scenarios for row
removal we are testing in this case. The other possibility is to add a
developer-level debug_disable_running_xact GUC to test this and
similar cases, or can't we have an injection point to control logging
this WAL record? I have seen the need to control logging running_xact
record in other cases as well.
Based on the idea of controlling the generation of RUNNING_XACTS, I prototyped a patch.
When the new injection point is attached to the instance, all processes skip
logging the record. This does not need to extend the injection_points module.
I tested with the reproducer and it passed on my env.
Sadly, IS_INJECTION_POINT_ATTACHED() was introduced in PG18, so the patch
cannot be backported to PG17 as-is.
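The shape of the prototype's guard can be sketched as follows (Python stand-in;
the real change is C around LogStandbySnapshot(), the point name matches the
later patch, and everything else here is hypothetical):

```python
ATTACHED_POINTS = set()   # stand-in for the injection-point registry

def injection_point_attached(name):
    # Models IS_INJECTION_POINT_ATTACHED(): cheap check, no wait.
    return name in ATTACHED_POINTS

def log_standby_snapshot(wal):
    """Skip emitting the RUNNING_XACTS record while the test holds the
    'log-running-xacts' point attached; otherwise append it."""
    if injection_point_attached("log-running-xacts"):
        return len(wal)            # report current insert position only
    wal.append("RUNNING_XACTS")
    return len(wal)

wal = []
log_standby_snapshot(wal)
assert wal == ["RUNNING_XACTS"]    # normally the record is logged
ATTACHED_POINTS.add("log-running-xacts")
log_standby_snapshot(wal)
assert wal == ["RUNNING_XACTS"]    # attached: nothing new is emitted
```

The return value in the skip branch corresponds to the "what LSN should we
return" question discussed later in the thread.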
How do you feel?
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-Use-injection_point-to-stabilize-035_standby_logical.patch (application/octet-stream)
Dear Amit, Bertrand,
Seeing all these failures, I wonder whether we can reliably test
active slots apart from wal_level change test (aka Scenario 6:
incorrect wal_level on primary.).
Hmm, agreed. We do not have a good solution to stabilize the tests, at least for now.
I've created a patch for PG16 which avoids using active slots in scenarios 1, 2, 3,
and 5, like attached. Other tests still use active slots:
* Scenario 6 invalidates slots due to the incorrect wal_level, so it is retained.
* The 'behaves_ok_' testcase, scenario 4 and the 'Test standby promotion...' testcase
won't invalidate slots, so they are retained.
* 'DROP DATABASE should drops...' invalidates slots, but it is not related to the
xmin horizon, so it is retained.
The patch targets only PG16, but can be created for PG17 as well, if needed.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-Avoid-using-active-slots-in-035_standby_logical_deco.patch (application/octet-stream)
On Wed, Mar 26, 2025 at 1:17 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Seeing all these failures, I wonder whether we can reliably test
active slots apart from wal_level change test (aka Scenario 6:
incorrect wal_level on primary.). Sure, we can try by having some
injection point kind of tests, but is it really worth because, anyway
the active slots won't get invalidated in the scenarios for row
removal we are testing in this case. The other possibility is to add a
developer-level debug_disable_running_xact GUC to test this and
similar cases, or can't we have an injection point to control logging
this WAL record? I have seen the need to control logging running_xact
record in other cases as well.
Based on the idea which controls generating RUNNING_XACTS, I prototyped a patch.
When the instance is attached the new injection point, all processes would skip
logging the record. This does not need to extend injection_point module.
Right, I think this is a better idea. I have a few comments:
1.
+ /*
+ * In 035_standby_logical_decoding.pl, RUNNING_XACTS could move slots's
+ * xmin forward and cause random failures.
No need to use test file name in code comments.
2. The comments atop wait_until_vacuum_can_remove can be changed to
indicate that we will avoid logging running_xact with the help of
injection points.
3.
+ # Note that from this point the checkpointer and bgwriter will wait before
+ # they write RUNNING_XACT record.
+ $node_primary->safe_psql('testdb',
+ "SELECT injection_points_attach('log-running-xacts', 'wait');");
Isn't it better to use 'error' as the second parameter as we don't
want to wait at this injection point?
4.
+ # XXX If the instance does not attach 'log-running-xacts', the bgwriter
+ # pocess would generate RUNNING_XACTS record, so that the test would fail.
+ sleep(20);
I think it is better to make a separate patch (as a first patch) for
this so that it can be used as a reproducer. I suggest to use
checkpoint as used by one of Bertrand's patches to ensure that the
issue reproduces in every environment.
Sadly IS_INJECTION_POINT_ATTACHED() was introduced for PG18 so that the patch
could not backport for PG17 as-is.
We can use 'wait' mode API in PG17 as used in one of the tests
(injection_points_attach('heap_update-before-pin', 'wait');) but I
think it may be better to just leave testing active slots in
backbranches because anyway the new development happens on HEAD and we
want to ensure that no breakage happens there.
--
With Regards,
Amit Kapila.
Dear Amit,
Right, I think this is a better idea. I have a few comments:
1.
+ /*
+ * In 035_standby_logical_decoding.pl, RUNNING_XACTS could move slots's
+ * xmin forward and cause random failures.
No need to use test file name in code comments.
Fixed.
2. The comments atop wait_until_vacuum_can_remove can be changed to
indicate that we will avoid logging running_xact with the help of
injection points.
Comments were updated for the master. In back-branches, they were removed
because the risk was removed.
3.
+ # Note that from this point the checkpointer and bgwriter will wait before
+ # they write RUNNING_XACT record.
+ $node_primary->safe_psql('testdb',
+ "SELECT injection_points_attach('log-running-xacts', 'wait');");
Isn't it better to use 'error' as the second parameter as we don't
want to wait at this injection point?
Right, and the comment atop it was updated.
4.
+ # XXX If the instance does not attach 'log-running-xacts', the bgwriter
+ # pocess would generate RUNNING_XACTS record, so that the test would fail.
+ sleep(20);
I think it is better to make a separate patch (as a first patch) for
this so that it can be used as a reproducer. I suggest to use
checkpoint as used by one of Bertrand's patches to ensure that the
issue reproduces in every environment.
The reproducer was separated into the .txt file.
Sadly IS_INJECTION_POINT_ATTACHED() was introduced for PG18 so that the patch
could not backport for PG17 as-is.
We can use 'wait' mode API in PG17 as used in one of the tests
(injection_points_attach('heap_update-before-pin', 'wait');) but I
think it may be better to just leave testing active slots in
backbranches because anyway the new development happens on HEAD and we
want to ensure that no breakage happens there.
OK. I've attached a patch for PG17 as well. Commit messages for them were also
updated.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
reproducer.txt (text/plain)
PG17-v2-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
PG16-v2-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
v2-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
Hi Kuroda-san and Amit,
On Fri, Mar 28, 2025 at 09:02:29AM +0000, Hayato Kuroda (Fujitsu) wrote:
Dear Amit,
Right, I think this is a better idea.
I like it too and the bonus point is that this injection point can be used
in more tests (more use cases).
A few comments:
==== About v2-0001-Stabilize
=== 1
s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
=== 2 (Nit)
/* For testing slot invalidation due to the conflict */
Not sure "due to the conflict" is needed.
==== About PG17-v2-0001
=== 3
The commit message still mentions injection point.
=== 4
-# Note that pg_current_snapshot() is used to get the horizon. It does
-# not generate a Transaction/COMMIT WAL record, decreasing the risk of
-# seeing a xl_running_xacts that would advance an active replication slot's
-# catalog_xmin. Advancing the active replication slot's catalog_xmin
-# would break some tests that expect the active slot to conflict with
-# the catalog xmin horizon.
I'd be tempted to not remove this comment but reword it a bit instead. Something
like?
# Note that pg_current_snapshot() is used to get the horizon. It does
# not generate a Transaction/COMMIT WAL record, decreasing the risk of
# seeing a xl_running_xacts that would advance an active replication slot's
# catalog_xmin. Advancing the active replication slot's catalog_xmin
# would break some tests that expect the active slot to conflict with
# the catalog xmin horizon. We ensure that active replication slots are not
# created for tests that might produce this race condition though.
=== 5
The invalidation checks for active slots are kept for the wal_level case. Also
the active slots are still created to test that logical decoding on the standby
behaves correctly, when no conflict is expected and for the promotion.
The above makes sense to me.
=== 6 (Nit)
In drop_logical_slots(), s/needs_active_slot/drop_active_slot/?
=== 7 (Nit)
In check_slots_conflict_reason(), s/needs_active_slot/checks_active_slot/?
==== About PG16-v2-0001
Same as for PG17-v2-0001.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Dear Bertrand,
s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
Fixed.
=== 2 (Nit)
/* For testing slot invalidation due to the conflict */
Not sure "due to the conflict" is needed.
OK, removed.
==== About PG17-v2-0001
=== 3
The commit message still mentions injection point.
Oh, removed.
=== 4
-# Note that pg_current_snapshot() is used to get the horizon. It does
-# not generate a Transaction/COMMIT WAL record, decreasing the risk of
-# seeing a xl_running_xacts that would advance an active replication slot's
-# catalog_xmin. Advancing the active replication slot's catalog_xmin
-# would break some tests that expect the active slot to conflict with
# the catalog xmin horizon.
I'd be tempted to not remove this comment but reword it a bit instead. Something
like?
# Note that pg_current_snapshot() is used to get the horizon. It does
# not generate a Transaction/COMMIT WAL record, decreasing the risk of
# seeing a xl_running_xacts that would advance an active replication slot's
# catalog_xmin. Advancing the active replication slot's catalog_xmin
# would break some tests that expect the active slot to conflict with
# the catalog xmin horizon. We ensure that active replication slots are not
# created for tests that might produce this race condition though.
Added.
=== 6 (Nit)
In drop_logical_slots(), s/needs_active_slot/drop_active_slot/?
Fixed.
=== 7 (Nit)
In check_slots_conflict_reason(), s/needs_active_slot/checks_active_slot/?
Fixed.
==== About PG16-v2-0001
Same as for PG17-v2-0001.
I followed all needed changes.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v3-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
PG16-v3-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
PG17-v3-0001-Stabilize-035_standby_logical_decoding.pl-by-usin.patch (application/octet-stream)
Hi Kuroda-san,
On Tue, Apr 01, 2025 at 01:22:49AM +0000, Hayato Kuroda (Fujitsu) wrote:
Dear Bertrand,
Thanks for the updated patch!
s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
Fixed.
hmm, not sure as I still can see:
+# Note that injection_point is used to avoid the seeing the xl_running_xacts
=== 1
+ * XXX What value should we return here? Originally this returns the
+ * inserted location of RUNNING_XACT record. Based on that, here
+ * returns the latest insert location for now.
+ */
+ return GetInsertRecPtr();
Looking at the LogStandbySnapshot() callers that use the output lsn, i.e:
pg_log_standby_snapshot()
BackgroundWriterMain()
ReplicationSlotReserveWal()
It looks ok to me to use GetInsertRecPtr().
But if we "really" want to produce a "new" WAL record, what about using
LogLogicalMessage()? It could also be used for debugging purpose. Bonus point:
it does not need wal_level to be set to logical. Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Apr 1, 2025 at 2:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi Kuroda-san,
On Tue, Apr 01, 2025 at 01:22:49AM +0000, Hayato Kuroda (Fujitsu) wrote:
Dear Bertrand,
Thanks for the updated patch!
s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
Fixed.
hmm, not sure as I still can see:
+# Note that injection_point is used to avoid the seeing the xl_running_xacts
=== 1
+ * XXX What value should we return here? Originally this returns the
+ * inserted location of RUNNING_XACT record. Based on that, here
+ * returns the latest insert location for now.
+ */
+ return GetInsertRecPtr();
Looking at the LogStandbySnapshot() that are using the output lsn, i.e:
pg_log_standby_snapshot()
BackgroundWriterMain()
ReplicationSlotReserveWal()
It looks ok to me to use GetInsertRecPtr().
+1.
But if we "really" want to produce a "new" WAL record, what about using
LogLogicalMessage()?
We are using injection points for testing purposes, which means the
caller is aware of skipping the running_xacts record during the test
run. So, there doesn't seem to be any reason to do anything extra.
--
With Regards,
Amit Kapila.
On Tue, Apr 1, 2025 at 6:53 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
With respect to 0001, can't this problem happen for the following case as well?
# Recovery conflict: Invalidate conflicting slots, including in-use slots
# Scenario 5: conflict due to on-access pruning.
You have not added any injection point for the above case. Isn't it
possible that if running_xact record is logged concurrently to the
pruning record, it should move the active slot on standby, and the
same failure should occur in this case as well?
--
With Regards,
Amit Kapila.
Dear Bertrand,
s/to avoid the seeing a xl_running_xacts/to avoid seeing a xl_running_xacts/?
Fixed.
Sorry, I misunderstood your comment and fixed it incorrectly. I will address it in the next version.
=== 1
+ * XXX What value should we return here? Originally this returns the
+ * inserted location of RUNNING_XACT record. Based on that, here
+ * returns the latest insert location for now.
+ */
+ return GetInsertRecPtr();
Looking at the LogStandbySnapshot() that are using the output lsn, i.e:
pg_log_standby_snapshot()
BackgroundWriterMain()
ReplicationSlotReserveWal()
It looks ok to me to use GetInsertRecPtr().
But if we "really" want to produce a "new" WAL record, what about using
LogLogicalMessage()? It could also be used for debugging purpose. Bonus point:
it does not need wal_level to be set to logical. Thoughts?
Right. Similarly, the SQL function pg_logical_emit_message() is sometimes used
for testing purposes, e.g. by advance_wal() and emit_wal() in Cluster.pm. Even so,
we have not found the use-case yet, thus I want to retain it as-is for now and
will update based on future needs.
I'll investigate another point [1] and then will post a new version.
[1]: /messages/by-id/CAA4eK1+x5-eOn5+MW6FiUjB_1bBCH8jCCARC1uMrx6erZ3J73w@mail.gmail.com
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
On Tue, Apr 01, 2025 at 04:55:06PM +0530, Amit Kapila wrote:
On Tue, Apr 1, 2025 at 2:02 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:But if we "really" want to produce a "new" WAL record, what about using
LogLogicalMessage()?We are using injection points for testing purposes, which means the
caller is aware of skipping the running_xacts record during the test
run. So, there doesn't seem to be any reason to do anything extra.
Agree, the idea was to provide extra debugging info for the tests. We can just
keep it in mind should we have a need for.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Dear Amit, Bertrand,
You have not added any injection point for the above case. Isn't it
possible that if running_xact record is logged concurrently to the
pruning record, it should move the active slot on standby, and the
same failure should occur in this case as well?
I considered that the timing failure can happen. Reproducer:
```
$node_primary->safe_psql('testdb', qq[UPDATE prun SET s = 'D';]);
+$node_primary->safe_psql('testdb', 'CHECKPOINT');
+sleep(20);
$node_primary->safe_psql('testdb', qq[UPDATE prun SET s = 'E';]);
```
And here is my theory...
Firstly, a new table was created with a smaller fillfactor. Then, after doing UPDATE
three times, the page became full. At the fourth UPDATE command (let's say txn4),
page pruning was done by the backend process and PRUNE_ON_ACCESS was generated.
It requested standbys to discard tuples from before the third UPDATE (say txn3),
thus the slot could be invalidated.
However, if a RUNNING_XACTS record is generated between txn3 and txn4, the
oldestRunningXid would be the same xid as txn4, and the catalog_xmin of the standby
slot would be advanced to it. The upcoming PRUNE_ON_ACCESS points to txn3, so
slot invalidation won't happen in this case.
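That timeline can be checked with a tiny replay model (entirely hypothetical
names; it only encodes the ordering argument above):

```python
def replay(events):
    """Replay WAL-ish events against a standby slot. For
    'running_xacts' the xid is oldestRunningXid (which may advance
    catalog_xmin); for 'prune' it is the snapshotConflictHorizon.
    Returns whether the slot got invalidated."""
    catalog_xmin = 0
    invalidated = False
    for kind, xid in events:
        if kind == "running_xacts":
            catalog_xmin = max(catalog_xmin, xid)
        elif kind == "prune" and catalog_xmin <= xid:
            invalidated = True
    return invalidated

txn3, txn4 = 765, 766
# Expected ordering: PRUNE_ON_ACCESS (horizon txn3) is replayed while
# catalog_xmin still trails it, so the conflict is reported.
assert replay([("running_xacts", txn3), ("prune", txn3)])
# A RUNNING_XACTS with oldestRunningXid = txn4 slips in first:
# catalog_xmin moves past the horizon and no invalidation happens.
assert not replay([("running_xacts", txn4), ("prune", txn3)])
```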
Based on that, I've updated the patch to use injection_points for scenario 5
as well. Of course, the PG16/17 patches won't use the active slot for that scenario.
Best regards,
Hayato Kuroda
FUJITSU LIMITED