Why is subscription/t/031_column_list.pl failing so much?
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-02%2019%3A33%3A16
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-02-02%2011%3A21%3A44
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2020%3A34%3A29
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2016%3A57%3A14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2022%3A18%3A24
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-30%2011%3A29%3A23
There are some further back too:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-17%2018%3A28%3A24
but this definitely got way more common in the last few days.
Digging down into the logs, these all look pretty similar. Somehow
things get into a state where replication connections fail with
the publisher reporting "publication does not exist":
2024-02-02 19:42:23.187 UTC [1631708][not initialized][:0] LOG: connection received: host=[local]
2024-02-02 19:42:23.189 UTC [1631708][walsender][4/287:0] LOG: connection authenticated: user="bf" method=trust (/home/bf/bf-build/tamandua/HEAD/pgsql.build/testrun/subscription/031_column_list/data/t_031_column_list_publisher_data/pgdata/pg_hba.conf:117)
2024-02-02 19:42:23.189 UTC [1631708][walsender][4/287:0] LOG: replication connection authorized: user=bf application_name=sub1
2024-02-02 19:42:23.214 UTC [1631708][walsender][4/288:0] LOG: statement: SELECT pg_catalog.set_config('search_path', '', false);
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: received replication command: IDENTIFY_SYSTEM
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: IDENTIFY_SYSTEM
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: received replication command: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] LOG: acquired logical replication slot "sub1"
2024-02-02 19:42:23.226 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] LOG: starting logical decoding for slot "sub1"
2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] DETAIL: Streaming transactions committing after 0/15BCDD0, reading WAL from 0/15BCDD0.
2024-02-02 19:42:23.242 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] LOG: logical decoding found consistent point at 0/15BCDD0
2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] DETAIL: There are no running transactions.
2024-02-02 19:42:23.243 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] ERROR: publication "pub7" does not exist
2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] CONTEXT: slot "sub1", output plugin "pgoutput", in the change callback, associated LSN 0/15C7698
2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/15BCDD0 (proto_version '4', origin 'any', publication_names '"pub7"')
2024-02-02 19:42:23.244 UTC [1631708][walsender][4/0:0] LOG: released logical replication slot "sub1"
2024-02-02 19:42:23.834 UTC [1631708][walsender][:0] LOG: disconnection: session time: 0:00:00.647 user=bf database=postgres host=[local]
and then we just repeat that until the test times out. It fails at
different points in the test script (hence, different publication
names), but the pattern looks about the same.
I don't see anything that 031_column_list.pl is doing that is much
different from other subscription tests, so why is it the only one
failing? And more to the point, what's going wrong exactly?
I am suspicious that this somehow represents a failure of the
historical catalog decoding logic, but I don't see how that theory
explains this only breaking in one test script.
regards, tom lane
On Fri, Feb 02, 2024 at 05:07:14PM -0500, Tom Lane wrote:
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-02%2019%3A33%3A16
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-02-02%2011%3A21%3A44
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2020%3A34%3A29
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2016%3A57%3A14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2022%3A18%3A24
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-30%2011%3A29%3A23
There are some further back too:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-17%2018%3A28%3A24
but this definitely got way more common in the last few days.
I don't see anything that 031_column_list.pl is doing that is much
different from other subscription tests, so why is it the only one
failing? And more to the point, what's going wrong exactly?
I don't know, but
/messages/by-id/16d6d9cc-f97d-0b34-be65-425183ed3721@gmail.com
reported a replacement BgWriterDelay value reproducing it. That hasn't
reproduced it in ~10 runs on my machine, though.
I am suspicious that this somehow represents a failure of the
historical catalog decoding logic, but I don't see how that theory
explains this only breaking in one test script.
On Fri, Feb 02, 2024 at 02:30:03PM -0800, Noah Misch wrote:
On Fri, Feb 02, 2024 at 05:07:14PM -0500, Tom Lane wrote:
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-02-02%2019%3A33%3A16
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-02-02%2011%3A21%3A44
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2020%3A34%3A29
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-02-01%2016%3A57%3A14
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2022%3A18%3A24
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-30%2011%3A29%3A23
There are some further back too:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2023-11-17%2018%3A28%3A24
but this definitely got way more common in the last few days.
I don't see anything that 031_column_list.pl is doing that is much
different from other subscription tests, so why is it the only one
failing? And more to the point, what's going wrong exactly?
I don't know, but
/messages/by-id/16d6d9cc-f97d-0b34-be65-425183ed3721@gmail.com
reported a replacement BgWriterDelay value reproducing it.
Correction: the recipe changes LOG_SNAPSHOT_INTERVAL_MS in addition to
BgWriterDelay.
That hasn't reproduced it in ~10 runs on my machine, though.
After 207 successes, it did fail once for me.
I am suspicious that this somehow represents a failure of the
historical catalog decoding logic, but I don't see how that theory
explains this only breaking in one test script.
Noah Misch <noah@leadboat.com> writes:
On Fri, Feb 02, 2024 at 05:07:14PM -0500, Tom Lane wrote:
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:
...
I don't see anything that 031_column_list.pl is doing that is much
different from other subscription tests, so why is it the only one
failing? And more to the point, what's going wrong exactly?
I don't know, but
/messages/by-id/16d6d9cc-f97d-0b34-be65-425183ed3721@gmail.com
reported a replacement BgWriterDelay value reproducing it. That hasn't
reproduced it in ~10 runs on my machine, though.
Ah, thanks for that link. I like the theory proposed in that thread
that the walsender is starting up at an LSN somewhere before where
the publication is created. I'm tempted to add some more queries to
the test script to see if that can be proven.
I'm still wondering how come the failure seems to have suddenly gotten
way more common. The only changes that are in vaguely-related places
and fit the time frame are Amit's 732924043 and 776621a5e, but I sure
don't see a connection.
regards, tom lane
Hello Tom and Noah,
03.02.2024 04:24, Tom Lane wrote:
I'm still wondering how come the failure seems to have suddenly gotten
way more common. The only changes that are in vaguely-related places
and fit the time frame are Amit's 732924043 and 776621a5e, but I sure
don't see a connection.
I think the failure rate increased because tamandua, calliphoridae,
flaviventris, and kestrel were switched from make to meson recently.
The last `make` builds for them:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-01-31%2016%3A51%3A31
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-01-31%2016%3A51%3A38
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2024-01-31%2016%3A52%3A37
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=kestrel&dt=2024-01-31%2016%3A51%3A53
and since that switch the 031_column_list duration has increased
significantly; e.g., on the last tamandua `make` run it executed for
7 seconds, but successful `meson` runs show much longer durations:
280/283 postgresql:subscription / subscription/031_column_list OK 38.27s 36 subtests passed
280/283 postgresql:subscription / subscription/031_column_list OK 126.13s 36 subtests passed
280/283 postgresql:subscription / subscription/031_column_list OK 31.93s 36 subtests passed
279/283 postgresql:subscription / subscription/031_column_list OK 99.76s 36 subtests passed
So, looking at the tamandua's failure log, I see:
2024-02-02 19:41:19.750 UTC [1579219][postmaster][:0] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc-12.3.0, 64-bit
...
2024-02-02 19:42:19.973 UTC [1629333][client backend][4/213:0] LOG: statement: ALTER SUBSCRIPTION sub1 SET PUBLICATION pub7
2024-02-02 19:42:20.131 UTC [1625765][logical replication apply worker][3/122:0] LOG: logical replication worker for subscription "sub1" will restart because of a parameter change
2024-02-02 19:42:20.137 UTC [1629333][client backend][:0] LOG: disconnection: session time: 0:00:00.212 user=bf database=postgres host=[local]
2024-02-02 19:42:20.191 UTC [1629535][logical replication apply worker][3/124:0] LOG: logical replication apply worker for subscription "sub1" has started
...
2024-02-02 19:42:20.445 UTC [1629535][logical replication apply worker][3/0:0] ERROR: could not receive data from WAL stream: ERROR: publication "pub7" does not exist
CONTEXT: slot "sub1", output plugin "pgoutput", in the change callback, associated LSN 0/15C7698
(The interval between subscriber start and the error is ~ 4 * 15 seconds.)
Thus it may still be explained by bgwriter activity, though perhaps
autovacuum/checkpointer activity adds something as well.
Best regards,
Alexander
On Sat, Feb 3, 2024 at 12:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:
03.02.2024 04:24, Tom Lane wrote:
I'm still wondering how come the failure seems to have suddenly gotten
way more common. The only changes that are in vaguely-related places
and fit the time frame are Amit's 732924043 and 776621a5e, but I sure
don't see a connection.
I think the failure rate increased because tamandua, calliphoridae,
flaviventris, and kestrel were switched from make to meson recently.
If this can be reproduced frequently then we can even try to test the
patch in that thread by Osumi-San [1] (I haven't tested that it
applies cleanly, but it shouldn't be difficult to make it work), based on
the theory that the walsender is starting up at an LSN somewhere before
where the publication is created.
[1]: /messages/by-id/TYCPR01MB83737A68CD5D554EA82BD7B9EDD39@TYCPR01MB8373.jpnprd01.prod.outlook.com
--
With Regards,
Amit Kapila.
Hello Amit,
05.02.2024 15:20, Amit Kapila wrote:
If this can be reproduced frequently then we can even try to test the
patch in that thread by Osumi-San [1] (I haven't tested that it
applies cleanly but shouldn't be difficult to make it work) based on
the theory that walsender is starting up at an LSN somewhere before
where the publication is created.
[1]: /messages/by-id/TYCPR01MB83737A68CD5D554EA82BD7B9EDD39@TYCPR01MB8373.jpnprd01.prod.outlook.com
Yes, with the aforementioned modification of bgwriter.c and when running
20 tests in parallel, I got failures on iterations 20, 3, 21, ..., but with
Osumi-San's updated patch (which adds wait_for_catchup('sub1') before every
ALTER SUBSCRIPTION sub1 SET PUBLICATION ...) applied, 300 iterations ran
with no failures.
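The mechanism behind that fix can be sketched as a polling loop. This is a hedged Python stand-in, not the real helper: in the test it is PostgreSQL::Test::Cluster::wait_for_catchup, which compares the publisher's pg_current_wal_lsn() against pg_stat_replication; the LSN getters below are illustrative assumptions. The point is that the subscriber only issues the next ALTER SUBSCRIPTION once it has replayed past the publisher's write position, so the walsender restarts at or after the CREATE PUBLICATION.

```python
import time

def wait_for_catchup(get_publisher_lsn, get_subscriber_lsn,
                     timeout=5.0, poll=0.01):
    # Snapshot the publisher's current write LSN, then poll until the
    # subscriber reports having replayed at least that far.
    target = get_publisher_lsn()
    deadline = time.monotonic() + timeout
    while get_subscriber_lsn() < target:
        if time.monotonic() > deadline:
            raise TimeoutError("subscriber did not catch up")
        time.sleep(poll)
    return target

# Simulated subscriber whose replayed LSN advances on every poll.
state = {"replayed": 90}
def sub_lsn():
    state["replayed"] += 10
    return state["replayed"]

print(wait_for_catchup(lambda: 150, sub_lsn))   # returns 150 once caught up
```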
Best regards,
Alexander
Attachments:
v2-0001-Fix-the-intermittent-Build-farm-failures-in-031_c.patch (text/x-patch)
diff --git a/src/test/subscription/t/031_column_list.pl b/src/test/subscription/t/031_column_list.pl
index 938582e31a..b199c8a1bf 100644
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -368,6 +368,8 @@ $node_subscriber->safe_psql(
CREATE TABLE tab5 (a int PRIMARY KEY, b int, d int);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
ALTER SUBSCRIPTION sub1 SET PUBLICATION pub2, pub3
@@ -409,6 +411,8 @@ $node_subscriber->safe_psql(
CREATE TABLE tab6 (a int PRIMARY KEY, b int, c int, d int);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
ALTER SUBSCRIPTION sub1 SET PUBLICATION pub4
@@ -485,6 +489,8 @@ $node_subscriber->safe_psql(
CREATE TABLE tab7 (a int PRIMARY KEY, b int, c int, d int);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
ALTER SUBSCRIPTION sub1 SET PUBLICATION pub5
@@ -601,6 +607,8 @@ $node_publisher->safe_psql(
ALTER PUBLICATION pub6 ADD TABLE test_part_a_2 (b);
));
+$node_publisher->wait_for_catchup('sub1');
+
# add the publication to our subscription, wait for sync to complete
$node_subscriber->safe_psql(
'postgres', qq(
@@ -667,6 +675,8 @@ $node_publisher->safe_psql(
CREATE PUBLICATION pub7 FOR TABLE test_part_b (a, b) WITH (publish_via_partition_root = true);
));
+$node_publisher->wait_for_catchup('sub1');
+
# add the publication to our subscription, wait for sync to complete
$node_subscriber->safe_psql(
'postgres', qq(
@@ -835,6 +845,8 @@ $node_publisher->safe_psql(
CREATE PUBLICATION pub9 FOR TABLE test_part_d (a) WITH (publish_via_partition_root = true);
));
+$node_publisher->wait_for_catchup('sub1');
+
# add the publication to our subscription, wait for sync to complete
$node_subscriber->safe_psql(
'postgres', qq(
@@ -877,6 +889,8 @@ $node_publisher->safe_psql(
INSERT INTO test_mix_2 VALUES (1, 2, 3);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
CREATE TABLE test_mix_2 (a int PRIMARY KEY, b int, c int);
@@ -1016,6 +1030,8 @@ $node_publisher->safe_psql(
INSERT INTO s1.t VALUES (1, 2, 3);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
CREATE SCHEMA s1;
@@ -1039,6 +1055,8 @@ is( $node_subscriber->safe_psql('postgres', "SELECT * FROM s1.t ORDER BY a"),
4|5|6),
'two publications, publishing the same relation');
+$node_publisher->wait_for_catchup('sub1');
+
# Now resync the subscription, but with publications in the opposite order.
# The result should be the same.
@@ -1082,6 +1100,8 @@ $node_publisher->safe_psql(
INSERT INTO t VALUES (1, 2, 3);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
DROP SCHEMA s1 CASCADE;
@@ -1130,6 +1150,8 @@ $node_publisher->safe_psql(
INSERT INTO t VALUES (1, 2, 3);
));
+$node_publisher->wait_for_catchup('sub1');
+
$node_subscriber->safe_psql(
'postgres', qq(
DROP TABLE t;
On Tue, Feb 6, 2024 at 8:30 AM Alexander Lakhin <exclusion@gmail.com> wrote:
05.02.2024 15:20, Amit Kapila wrote:
If this can be reproduced frequently then we can even try to test the
patch in that thread by Osumi-San [1] (I haven't tested that it
applies cleanly but shouldn't be difficult to make it work) based on
the theory that walsender is starting up at an LSN somewhere before
where the publication is created.
[1]: /messages/by-id/TYCPR01MB83737A68CD5D554EA82BD7B9EDD39@TYCPR01MB8373.jpnprd01.prod.outlook.com
Yes, with the aforementioned modification of bgwriter.c and when running
20 tests in parallel, I got failures on iterations 20, 3, 21, ..., but with the
updated Osumi-San's patch (which adds wait_for_catchup('sub1') before every
ALTER SUBSCRIPTION sub1 SET PUBLICATION ...) applied, 300 iterations ran
with no failures.
cool, is it possible to see whether this patch changes the runtime of
this test in any noticeable way?
--
With Regards,
Amit Kapila.
06.02.2024 09:48, Amit Kapila wrote:
cool, is it possible to see whether this patch changes the runtime of
this test in any noticeable way?
Yes, unfortunately it does.
I've measured the duration of 100 test runs without the patch (with
pristine bgwriter and with NO_TEMP_INSTALL):
real 6m46,031s
real 6m52,406s
real 6m51,014s
But with the patched test, I've got:
real 9m39,872s
real 9m40,044s
real 9m38,236s
(nearly 2 seconds increase per one test run)
Under Valgrind, the original test run takes:
Files=1, Tests=36, 334 wallclock secs ( 0.02 usr 0.00 sys + 163.14 cusr 7.98 csys = 171.14 CPU)
But the patched one:
Files=1, Tests=36, 368 wallclock secs ( 0.02 usr 0.00 sys + 182.16 cusr 8.90 csys = 191.08 CPU)
(30 seconds increase)
Maybe the more CPU-efficient solution would be disabling bgwriter, as was
proposed in another discussion of test instability:
/messages/by-id/ZaTxhjnPygOdosJ4@ip-10-97-1-34.eu-west-3.compute.internal
Though I think that devising a way to control bgwriter may take more time
than we can afford given the current 031 failure rate on the buildfarm
(17 failures for the last 3 days).
Best regards,
Alexander
On Tue, Feb 6, 2024 at 2:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:
06.02.2024 09:48, Amit Kapila wrote:
cool, is it possible to see whether this patch changes the runtime of
this test in any noticeable way?
Yes, unfortunately it does.
I've measured duration of 100 tests runs without the patch (with pristine
bgwriter and with NO_TEMP_INSTALL):
real 6m46,031s
real 6m52,406s
real 6m51,014s
But with the patched test, I've got:
real 9m39,872s
real 9m40,044s
real 9m38,236s
(nearly 2 seconds increase per one test run)
Under Valgrind, the original test run takes:
Files=1, Tests=36, 334 wallclock secs ( 0.02 usr 0.00 sys + 163.14 cusr 7.98 csys = 171.14 CPU)
But the patched one:
Files=1, Tests=36, 368 wallclock secs ( 0.02 usr 0.00 sys + 182.16 cusr 8.90 csys = 191.08 CPU)
(30 seconds increase)
Yeah, I was worried about that. The other idea I had previously
thought of was to change ALTER SUBSCRIPTION to DROP+CREATE SUBSCRIPTION.
That should also help in bringing stability without losing any
functionality.
--
With Regards,
Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes:
Yeah, I was worried about that. The other idea I have previously
thought was to change Alter Subscription to Drop+Create Subscription.
That should also help in bringing stability without losing any
functionality.
Hm, why would that fix it?
More to the point, aren't these proposals just band-aids that
would stabilize the test without fixing the actual problem?
The same thing is likely to happen to people in the field,
unless we do something drastic like removing ALTER SUBSCRIPTION.
regards, tom lane
I wrote:
More to the point, aren't these proposals just band-aids that
would stabilize the test without fixing the actual problem?
The same thing is likely to happen to people in the field,
unless we do something drastic like removing ALTER SUBSCRIPTION.
I've been able to make the 031_column_list.pl failure pretty
reproducible by adding a delay in walsender, as attached.
While I'm not too familiar with this code, it definitely does appear
that the new walsender is told to start up at an LSN before the
creation of the publication, and then if it needs to decide whether
to stream a particular data change before it's reached that creation,
kaboom!
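The race described above can be sketched as a toy model. This is illustrative Python only; the integer LSNs, the one-list "catalog", and the error text are stand-ins for PostgreSQL internals, not real APIs. The key behavior modeled is that decoding evaluates each change against a historical catalog snapshot taken at that change's LSN, so a change lying between the slot's start point and the CREATE PUBLICATION cannot see the publication:

```python
def publications_visible_at(catalog_events, lsn):
    # Mock "historical snapshot": publications whose CREATE precedes this LSN.
    return {name for (create_lsn, name) in catalog_events if create_lsn <= lsn}

def decode(changes, catalog_events, start_lsn, wanted_pub):
    # Replay WAL changes from start_lsn in LSN order; fail the way the
    # walsender does when the publication is not yet visible at a change.
    out = []
    for lsn, change in sorted(changes):
        if lsn < start_lsn:
            continue
        if wanted_pub not in publications_visible_at(catalog_events, lsn):
            raise LookupError(
                f'publication "{wanted_pub}" does not exist (change at LSN {lsn})')
        out.append(change)
    return out

catalog = [(150, "pub7")]                      # CREATE PUBLICATION pub7 at LSN 150
wal = [(120, "insert into t1"), (200, "insert into t2")]

# Starting before the CREATE PUBLICATION (e.g. the old slot's restart point)
# hits the error; starting after it streams cleanly.
try:
    decode(wal, catalog, start_lsn=100, wanted_pub="pub7")
except LookupError as e:
    print(e)
print(decode(wal, catalog, start_lsn=160, wanted_pub="pub7"))
```

In this model, retrying from the same start LSN repeats the same failure, which matches the loop of identical walsender errors seen in the buildfarm logs.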
I read and understood the upthread worries about it not being
a great idea to ignore publication lookup failures, but I really
don't see that we have much choice. As an example, if a subscriber
is humming along reading publication pub1, and then someone
drops and then recreates pub1 on the publisher, I don't think that
the subscriber will be able to advance through that gap if there
are any operations within it that require deciding if they should
be streamed. (That is, contrary to Amit's expectation that
DROP/CREATE would mask the problem, I suspect it will instead turn
it into a hard failure. I've not experimented though.)
BTW, this same change breaks two other subscription tests:
015_stream.pl and 022_twophase_cascade.pl.
The symptoms are different (no "publication does not exist" errors),
so maybe these are just test problems not fundamental weaknesses.
But "replication falls over if the walsender is slow" isn't
something I'd call acceptable.
regards, tom lane
Attachments:
hack-add-delay-in-walsender-loop.patch (text/x-diff)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 77c8baa32a..c897ef4c60 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -2699,6 +2699,8 @@ WalSndLoop(WalSndSendDataCallback send_data)
!pq_is_send_pending())
break;
+ pg_usleep(10000);
+
/*
* If we don't have any pending data in the output buffer, try to send
* some more. If there is some, we don't bother to call send_data
On Tue, 6 Feb 2024 at 08:30, Alexander Lakhin <exclusion@gmail.com> wrote:
Hello Amit,
05.02.2024 15:20, Amit Kapila wrote:
If this can be reproduced frequently then we can even try to test the
patch in that thread by Osumi-San [1] (I haven't tested that it
applies cleanly but shouldn't be difficult to make it work) based on
the theory that walsender is starting up at an LSN somewhere before
where the publication is created.
[1]: /messages/by-id/TYCPR01MB83737A68CD5D554EA82BD7B9EDD39@TYCPR01MB8373.jpnprd01.prod.outlook.com
Yes, with the aforementioned modification of bgwriter.c and when running
20 tests in parallel, I got failures on iterations 20, 3, 21, ..., but with the
updated Osumi-San's patch (which adds wait_for_catchup('sub1') before every
ALTER SUBSCRIPTION sub1 SET PUBLICATION ...) applied, 300 iterations ran
with no failures.
I was able to reproduce the issue with the patch changes suggested for
bgwriter, but for me it failed only on the 287th run.
Then I ran the test 1000 times with the test changes shared at [1],
and the test passed all 1000 times.
I measured the test execution time as an average of 10 runs and found
that it takes about 1.2 seconds longer with the changes:
Without patch: 8.454 seconds
With test change patch: 9.672 seconds
For the test execution comparison I used a machine with 755.536 GB of
memory, 120 CPUs, and RHEL 7 as the operating system.
[1]: /messages/by-id/e6ce3cf7-4025-f129-e3ac-0f778469f720@gmail.com
Regards,
Vignesh
On Tue, Feb 6, 2024 at 8:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
Yeah, I was worried about that. The other idea I have previously
thought was to change Alter Subscription to Drop+Create Subscription.
That should also help in bringing stability without losing any
functionality.
Hm, why would that fix it?
Because for new subscriptions, we will start reading WAL from the
latest WAL insert pointer on the publisher, which will be after the
point where the publication is created.
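This contrast can be stated with a couple of hypothetical LSNs (toy numbers, purely illustrative; real LSNs are 64-bit positions like 0/15BCDD0): the reused slot's restart point may precede the CREATE PUBLICATION, while a freshly created subscription's slot starts at the current insert pointer, which cannot.

```python
# Toy LSNs (hypothetical) contrasting where replication starts in each approach.
create_publication_lsn = 150   # CREATE PUBLICATION runs here
old_slot_restart_lsn = 100     # ALTER SUBSCRIPTION reuses this older restart point
current_insert_lsn = 210       # DROP+CREATE SUBSCRIPTION builds a new slot here

# The reused slot can be asked to decode changes from before the publication
# existed; the new slot, by construction, cannot.
alter_sees_gap = old_slot_restart_lsn < create_publication_lsn
recreate_sees_gap = current_insert_lsn < create_publication_lsn
print(alter_sees_gap, recreate_sees_gap)   # True False
```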
More to the point, aren't these proposals just band-aids that
would stabilize the test without fixing the actual problem?
Yes, but OTOH, this behavior has existed since the beginning of logical
replication. This particular test has just exposed it, so letting the
buildfarm keep failing on this particular test doesn't sound like the
best way to remember it.
--
With Regards,
Amit Kapila.
On Wed, Feb 7, 2024 at 2:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
More to the point, aren't these proposals just band-aids that
would stabilize the test without fixing the actual problem?
The same thing is likely to happen to people in the field,
unless we do something drastic like removing ALTER SUBSCRIPTION.
I've been able to make the 031_column_list.pl failure pretty
reproducible by adding a delay in walsender, as attached.
While I'm not too familiar with this code, it definitely does appear
that the new walsender is told to start up at an LSN before the
creation of the publication, and then if it needs to decide whether
to stream a particular data change before it's reached that creation,
kaboom!
I read and understood the upthread worries about it not being
a great idea to ignore publication lookup failures, but I really
don't see that we have much choice. As an example, if a subscriber
is humming along reading publication pub1, and then someone
drops and then recreates pub1 on the publisher, I don't think that
the subscriber will be able to advance through that gap if there
are any operations within it that require deciding if they should
be streamed.
Right. One idea to address those worries was to have a new
subscription option like ignore_nonexistent_pubs (or some better name
for such an option). The 'true' value of this new option means that we
will ignore publication lookup failures and continue replication;
'false' means raise an error, as we do now. If we agree that
such an option is useful or at least saves us in some cases as
discussed in another thread [1], we can keep the default value as true
so that users don't face such errors by default and also have a way to
go back to current behavior.
(That is, contrary to Amit's expectation that
DROP/CREATE would mask the problem, I suspect it will instead turn
it into a hard failure. I've not experimented though.)
This is not contrary, because I was suggesting DROP/CREATE of the
subscription, whereas you are talking about dropping and recreating
the publication.
BTW, this same change breaks two other subscription tests:
015_stream.pl and 022_twophase_cascade.pl.
The symptoms are different (no "publication does not exist" errors),
so maybe these are just test problems not fundamental weaknesses.
As per the initial analysis, this is because those cases have somewhat
larger transactions (more than 64kB) under test, so it just times out
waiting for all the data to be replicated. We will do further analysis
and share the findings.
--
With Regards,
Amit Kapila.
On Wed, 7 Feb 2024 at 15:21, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 6, 2024 at 8:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
Yeah, I was worried about that. The other idea I have previously
thought was to change Alter Subscription to Drop+Create Subscription.
That should also help in bringing stability without losing any
functionality.
Hm, why would that fix it?
Because for new subscriptions, we will start reading WAL from the
latest WAL insert pointer on the publisher which will be after the
point where publication is created.
I was able to reproduce the issue consistently with the changes shared
by Tom Lane at [1].
I have made changes to replace ALTER SUBSCRIPTION with DROP+CREATE
SUBSCRIPTION and verified that the test passed consistently for the
50 runs that I ran. Also, the test execution time increase in this
case is negligible:
Without patch: 7.991 seconds
With test change patch: 8.121 seconds
The test changes for the same are attached.
[1]: /messages/by-id/631312.1707251789@sss.pgh.pa.us
Regards,
Vignesh
Attachments:
v3-0001-Fix-the-intermittent-Build-farm-failures-in-031_c.patch (text/x-patch)
From 35e31ca3d843d81b43a01d537073174f30f5097c Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Wed, 7 Feb 2024 15:49:00 +0530
Subject: [PATCH v3] Fix the intermittent buildfarm failures in 031_column_list test.
Fix the intermittent buildfarm failures in 031_column_list test.
---
src/test/subscription/t/031_column_list.pl | 43 ++++++++++++++--------
1 file changed, 28 insertions(+), 15 deletions(-)
diff --git a/src/test/subscription/t/031_column_list.pl b/src/test/subscription/t/031_column_list.pl
index 938582e31a..9a97fa5020 100644
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -370,7 +370,8 @@ $node_subscriber->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub2, pub3
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub2, pub3
));
$node_subscriber->wait_for_subscription_sync($node_publisher, 'sub1');
@@ -411,7 +412,8 @@ $node_subscriber->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub4
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub4
));
$node_subscriber->wait_for_subscription_sync;
@@ -487,7 +489,8 @@ $node_subscriber->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub5
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub5
));
$node_subscriber->wait_for_subscription_sync;
@@ -601,10 +604,12 @@ $node_publisher->safe_psql(
ALTER PUBLICATION pub6 ADD TABLE test_part_a_2 (b);
));
-# add the publication to our subscription, wait for sync to complete
+# create the subscription for the above publication, wait for sync to
+# complete
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub6
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub6
));
$node_subscriber->wait_for_subscription_sync;
@@ -667,10 +672,12 @@ $node_publisher->safe_psql(
CREATE PUBLICATION pub7 FOR TABLE test_part_b (a, b) WITH (publish_via_partition_root = true);
));
-# add the publication to our subscription, wait for sync to complete
+# create the subscription for the above publication, wait for sync to
+# complete
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub7
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub7
));
$node_subscriber->wait_for_subscription_sync;
@@ -737,7 +744,8 @@ $node_publisher->safe_psql(
ALTER PUBLICATION pub8 ADD TABLE test_part_c_2 (a,b);
));
-# add the publication to our subscription, wait for sync to complete
+# create the subscription for the above publication, wait for sync to
+# complete
$node_subscriber->safe_psql(
'postgres', qq(
DROP SUBSCRIPTION sub1;
@@ -835,10 +843,12 @@ $node_publisher->safe_psql(
CREATE PUBLICATION pub9 FOR TABLE test_part_d (a) WITH (publish_via_partition_root = true);
));
-# add the publication to our subscription, wait for sync to complete
+# create the subscription for the above publication, wait for sync to
+# complete
$node_subscriber->safe_psql(
'postgres', qq(
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub9
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub9
));
$node_subscriber->wait_for_subscription_sync;
@@ -880,8 +890,8 @@ $node_publisher->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
CREATE TABLE test_mix_2 (a int PRIMARY KEY, b int, c int);
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub_mix_3, pub_mix_4;
- ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_3, pub_mix_4;
));
$node_subscriber->wait_for_subscription_sync;
@@ -1022,7 +1032,8 @@ $node_subscriber->safe_psql(
CREATE TABLE s1.t (a int, b int, c int) PARTITION BY RANGE (a);
CREATE TABLE t_1 PARTITION OF s1.t FOR VALUES FROM (1) TO (10);
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub1, pub2;
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1, pub2;
));
$node_subscriber->wait_for_subscription_sync;
@@ -1090,7 +1101,8 @@ $node_subscriber->safe_psql(
PARTITION BY RANGE (a);
CREATE TABLE t_2 PARTITION OF t_1 FOR VALUES FROM (1) TO (10);
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub3;
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub3;
));
$node_subscriber->wait_for_subscription_sync;
@@ -1138,7 +1150,8 @@ $node_subscriber->safe_psql(
PARTITION BY RANGE (a);
CREATE TABLE t_2 PARTITION OF t_1 FOR VALUES FROM (1) TO (10);
- ALTER SUBSCRIPTION sub1 SET PUBLICATION pub4;
+ DROP SUBSCRIPTION sub1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub4;
));
$node_subscriber->wait_for_subscription_sync;
--
2.34.1
On Wed, 7 Feb 2024 at 15:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Feb 7, 2024 at 2:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
More to the point, aren't these proposals just band-aids that
would stabilize the test without fixing the actual problem?
The same thing is likely to happen to people in the field,
unless we do something drastic like removing ALTER SUBSCRIPTION.

I've been able to make the 031_column_list.pl failure pretty
reproducible by adding a delay in walsender, as attached.

While I'm not too familiar with this code, it definitely does appear
that the new walsender is told to start up at an LSN before the
creation of the publication, and then if it needs to decide whether
to stream a particular data change before it's reached that creation,
kaboom!

I read and understood the upthread worries about it not being
a great idea to ignore publication lookup failures, but I really
don't see that we have much choice. As an example, if a subscriber
is humming along reading publication pub1, and then someone
drops and then recreates pub1 on the publisher, I don't think that
the subscriber will be able to advance through that gap if there
are any operations within it that require deciding if they should
be streamed.

Right. One idea to address those worries was to have a new
subscription option like ignore_nonexistent_pubs (or some better name
for such an option). The 'true' value of this new option means that we
will ignore the publication lookup failures and continue replication;
the 'false' value means give an error as we are doing now. If we agree
that such an option is useful or at least saves us in some cases as
discussed in another thread [1], we can keep the default value as true
so that users don't face such errors by default and also have a way to
go back to the current behavior.

(That is, contrary to Amit's expectation that
DROP/CREATE would mask the problem, I suspect it will instead turn
it into a hard failure. I've not experimented though.)

This is not contrary, because I was suggesting to DROP/CREATE the
Subscription whereas you are talking of dropping and recreating the
Publication.

BTW, this same change breaks two other subscription tests:
015_stream.pl and 022_twophase_cascade.pl.
The symptoms are different (no "publication does not exist" errors),
so maybe these are just test problems, not fundamental weaknesses.

As per the initial analysis, this is because those cases have somewhat
larger transactions (more than 64kB) under test, so it just times out
waiting for all the data to be replicated. We will do further analysis
and share the findings.
Yes, these tests fail while waiting for the larger transactions to be
replicated within the 180-second timeout, as the transactions need
more time to replicate because of the added sleep. To verify this I
tried a couple of things: a) I increased the timeout to a higher value
and verified that both tests run successfully with an 1800-second
timeout; b) I reduced the sleep to 1000 microseconds and verified that
both tests run successfully.
So I feel the failures of 015_stream.pl and 022_twophase_cascade.pl
after adding the sleep can be ignored.
Regards,
Vignesh
Coincidentally, one of my buildfarm animals hung for several weeks in a
different test, 035_standby_logical_decoding.pl. A LOG_SNAPSHOT_INTERVAL_MS
reduction was part of making it reproducible:
On Fri, Feb 02, 2024 at 04:01:45PM -0800, Noah Misch wrote:
On Fri, Feb 02, 2024 at 02:30:03PM -0800, Noah Misch wrote:
On Fri, Feb 02, 2024 at 05:07:14PM -0500, Tom Lane wrote:
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:
/messages/by-id/16d6d9cc-f97d-0b34-be65-425183ed3721@gmail.com
reported a replacement BgWriterDelay value reproducing it.

Correction: the recipe changes LOG_SNAPSHOT_INTERVAL_MS in addition to
BgWriterDelay.
I'm reusing this thread just in case there's overlap with the
031_column_list.pl cause and fix. The 035_standby_logical_decoding.pl hang is
a race condition arising from an event sequence like this:
- Test script sends CREATE SUBSCRIPTION to subscriber, which loses the CPU.
- Test script calls pg_log_standby_snapshot() on primary. Emits XLOG_RUNNING_XACTS.
- checkpoint_timeout makes a primary checkpoint finish. Emits XLOG_RUNNING_XACTS.
- bgwriter executes LOG_SNAPSHOT_INTERVAL_MS logic. Emits XLOG_RUNNING_XACTS.
- CREATE SUBSCRIPTION wakes up and sends CREATE_REPLICATION_SLOT to standby.
Other test code already has a solution for this, so the attached patches add a
timeout and copy the existing solution. I'm also attaching the hack that
makes it 100% reproducible.
Attachments:
repro-standby-slot-test-race-v1.patch (text/plain; charset=us-ascii)
Author: Noah Misch <noah@leadboat.com>
Commit: Noah Misch <noah@leadboat.com>
[not for commit] Demonstrate 035_standby_logical_decoding.pl hang.
There are three ways XLOG_RUNNING_XACTS can show up:
- SELECT pg_log_standby_snapshot()
- CHECKPOINT
- bgwriter LOG_SNAPSHOT_INTERVAL_MS
An idle primary won't do the latter two. If you're unlucky enough for
all three to happen before CREATE_REPLICATION_SLOT's
DecodingContextFindStartpoint() call starts watching, then
CREATE_REPLICATION_SLOT will hang indefinitely.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index f11ce27..07f987c 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -70,7 +70,7 @@ int BgWriterDelay = 200;
* Interval in which standby snapshots are logged into the WAL stream, in
* milliseconds.
*/
-#define LOG_SNAPSHOT_INTERVAL_MS 15000
+#define LOG_SNAPSHOT_INTERVAL_MS 1
/*
* LSN and timestamp at which we last issued a LogStandbySnapshot(), to avoid
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 9270d7b..0f170a4 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -1085,6 +1085,7 @@ libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
appendStringInfoString(&cmd, " PHYSICAL RESERVE_WAL");
}
+ pg_usleep(1000 * 1000);
res = libpqrcv_PQexec(conn->streamConn, cmd.data);
pfree(cmd.data);
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index cebfa52..c371369 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -467,6 +467,7 @@ $psql_subscriber{run}->pump_nb();
# Speed up the subscription creation
$node_primary->safe_psql('postgres', "SELECT pg_log_standby_snapshot()");
+$node_primary->safe_psql('postgres', "CHECKPOINT");
# Explicitly shut down psql instance gracefully - to avoid hangs
# or worse on windows
standby-slot-test-1-timeout-v1.patch (text/plain; charset=us-ascii)
Author: Noah Misch <noah@leadboat.com>
Commit: Noah Misch <noah@leadboat.com>
Bound waits in 035_standby_logical_decoding.pl.
One IPC::Run::start() used an IPC::Run::timer() without checking for
expiration. The other used no timeout or timer. Back-patch to v16,
which introduced the test.
Reviewed by FIXME.
Discussion: https://postgr.es/m/FIXME
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index c371369..61da915 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -21,7 +21,6 @@ my $node_cascading_standby =
PostgreSQL::Test::Cluster->new('cascading_standby');
my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
my $default_timeout = $PostgreSQL::Test::Utils::timeout_default;
-my $psql_timeout = IPC::Run::timer($default_timeout);
my $res;
# Name for the physical slot on primary
@@ -90,7 +89,8 @@ sub make_slot_active
'>',
$to_stdout,
'2>',
- $to_stderr);
+ $to_stderr,
+ IPC::Run::timeout($default_timeout));
if ($wait)
{
@@ -341,7 +341,7 @@ $psql_subscriber{run} = IPC::Run::start(
\$psql_subscriber{subscriber_stdout},
'2>',
\$psql_subscriber{subscriber_stderr},
- $psql_timeout);
+ IPC::Run::timeout($default_timeout));
##################################################
# Test that logical decoding on the standby
standby-slot-test-2-race-v1.patch (text/plain; charset=us-ascii)
Author: Noah Misch <noah@leadboat.com>
Commit: Noah Misch <noah@leadboat.com>
Fix test race between primary XLOG_RUNNING_XACTS and standby logical slot.
Before the previous commit, the test could hang until
LOG_SNAPSHOT_INTERVAL_MS (15s), until checkpoint_timeout (300s), or
indefinitely. An indefinite hang was awfully improbable. It entailed
the test reaching checkpoint_timeout after a CREATE SUBSCRIPTION and
before its DecodingContextFindStartpoint(). Back-patch to v16, which
introduced the test.
Reviewed by FIXME.
Discussion: https://postgr.es/m/FIXME
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index 61da915..f7b7fc7 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -466,4 +466,11 @@ $psql_subscriber{subscriber_stdin} .= "\n";
$psql_subscriber{run}->pump_nb();
+# See corresponding create_logical_slot_on_standby() code.
+$node_standby->poll_query_until(
+ 'postgres', qq[
+ SELECT restart_lsn IS NOT NULL
+ FROM pg_catalog.pg_replication_slots WHERE slot_name = 'tap_sub'
+ ]) or die "timed out waiting for logical slot to calculate its restart_lsn";
+
# Speed up the subscription creation
$node_primary->safe_psql('postgres', "SELECT pg_log_standby_snapshot()");
On Wed, 7 Feb 2024 at 16:27, vignesh C <vignesh21@gmail.com> wrote:
On Wed, 7 Feb 2024 at 15:21, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 6, 2024 at 8:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
Yeah, I was worried about that. The other idea I had previously
thought of was to change ALTER SUBSCRIPTION to DROP+CREATE
SUBSCRIPTION. That should also help in bringing stability without
losing any functionality.

Hm, why would that fix it?

Because for new subscriptions, we will start reading WAL from the
latest WAL insert pointer on the publisher, which will be after the
point where the publication is created.

I was able to reproduce the issue consistently with the changes shared
by Tom Lane at [1].
I have made changes to change ALTER SUBSCRIPTION to DROP+CREATE
SUBSCRIPTION and verified that the test passed consistently for the 50
runs that I ran. Also, the increase in test execution time is
negligible:
Without patch: 7.991 seconds
With test change patch: 8.121 seconds

The test changes for the same are attached.
Alternatively, this could also be fixed like the changes proposed by
Amit at [1]. In this case we ignore publications that are not found
for the purpose of computing RelSyncEntry attributes. We won't mark
such an entry as valid till all the publications are loaded without
anything missing. This means we won't publish operations on tables
corresponding to that publication till we find such a publication, and
that seems okay.

Tomas had raised a performance issue at [2]: forcing us to reload the
publications for every replicated change/row in case the publications
are invalid. How about keeping the default behavior as it is and
providing a new option skip_not_exist_publication while
creating/altering a subscription? In this case, if
skip_not_exist_publication is specified, we will ignore missing
publications, and the publications will be kept invalid and get
validated later.
The attached patch has the changes for the same. Thoughts?
[1]: /messages/by-id/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
[2]: /messages/by-id/dc08add3-10a8-738b-983a-191c7406707b@enterprisedb.com
Regards,
Vignesh
Attachments:
Column_list_failure_fix_by_skip_not_exist_publication_option.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index 406a3c2dd1..404577725e 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -74,6 +74,7 @@ GetSubscription(Oid subid, bool missing_ok)
sub->passwordrequired = subform->subpasswordrequired;
sub->runasowner = subform->subrunasowner;
sub->failover = subform->subfailover;
+ sub->skipnotexistpub = subform->subskipnotexistpub;
/* Get conninfo */
datum = SysCacheGetAttrNotNull(SUBSCRIPTIONOID,
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 6791bff9dd..bdd367d272 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1359,7 +1359,8 @@ REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subskiplsn, subname, subowner, subenabled,
subbinary, substream, subtwophasestate, subdisableonerr,
subpasswordrequired, subrunasowner, subfailover,
- subslotname, subsynccommit, subpublications, suborigin)
+ subskipnotexistpub, subslotname, subsynccommit, subpublications,
+ suborigin)
ON pg_subscription TO public;
CREATE VIEW pg_stat_subscription_stats AS
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index a05d69922d..bd59efc73a 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -72,6 +72,7 @@
#define SUBOPT_FAILOVER 0x00002000
#define SUBOPT_LSN 0x00004000
#define SUBOPT_ORIGIN 0x00008000
+#define SUBOPT_SKIP_NOT_EXISTS_PUB 0x00010000
/* check if the 'val' has 'bits' set */
#define IsSet(val, bits) (((val) & (bits)) == (bits))
@@ -97,6 +98,7 @@ typedef struct SubOpts
bool passwordrequired;
bool runasowner;
bool failover;
+ bool skipnotexistpub;
char *origin;
XLogRecPtr lsn;
} SubOpts;
@@ -159,6 +161,8 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->runasowner = false;
if (IsSet(supported_opts, SUBOPT_FAILOVER))
opts->failover = false;
+ if (IsSet(supported_opts, SUBOPT_SKIP_NOT_EXISTS_PUB))
+ opts->skipnotexistpub = false;
if (IsSet(supported_opts, SUBOPT_ORIGIN))
opts->origin = pstrdup(LOGICALREP_ORIGIN_ANY);
@@ -316,6 +320,15 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
opts->specified_opts |= SUBOPT_FAILOVER;
opts->failover = defGetBoolean(defel);
}
+ else if (IsSet(supported_opts, SUBOPT_SKIP_NOT_EXISTS_PUB) &&
+ strcmp(defel->defname, "skip_not_exist_publication") == 0)
+ {
+ if (IsSet(opts->specified_opts, SUBOPT_SKIP_NOT_EXISTS_PUB))
+ errorConflictingDefElem(defel, pstate);
+
+ opts->specified_opts |= SUBOPT_SKIP_NOT_EXISTS_PUB;
+ opts->skipnotexistpub = defGetBoolean(defel);
+ }
else if (IsSet(supported_opts, SUBOPT_ORIGIN) &&
strcmp(defel->defname, "origin") == 0)
{
@@ -408,6 +421,13 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
errmsg("%s and %s are mutually exclusive options",
"connect = false", "failover = true")));
+ if (opts->skipnotexistpub &&
+ IsSet(opts->specified_opts, SUBOPT_SKIP_NOT_EXISTS_PUB))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s and %s are mutually exclusive options",
+ "connect = false", "skip_not_exist_publication = true")));
+
/* Change the defaults of other options. */
opts->enabled = false;
opts->create_slot = false;
@@ -611,7 +631,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
SUBOPT_SYNCHRONOUS_COMMIT | SUBOPT_BINARY |
SUBOPT_STREAMING | SUBOPT_TWOPHASE_COMMIT |
SUBOPT_DISABLE_ON_ERR | SUBOPT_PASSWORD_REQUIRED |
- SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER | SUBOPT_ORIGIN);
+ SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
+ SUBOPT_SKIP_NOT_EXISTS_PUB | SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options, supported_opts, &opts);
/*
@@ -718,6 +739,8 @@ CreateSubscription(ParseState *pstate, CreateSubscriptionStmt *stmt,
values[Anum_pg_subscription_subpasswordrequired - 1] = BoolGetDatum(opts.passwordrequired);
values[Anum_pg_subscription_subrunasowner - 1] = BoolGetDatum(opts.runasowner);
values[Anum_pg_subscription_subfailover - 1] = BoolGetDatum(opts.failover);
+ values[Anum_pg_subscription_subskipnotexistpub - 1] =
+ BoolGetDatum(opts.skipnotexistpub);
values[Anum_pg_subscription_subconninfo - 1] =
CStringGetTextDatum(conninfo);
if (opts.slot_name)
@@ -1169,6 +1192,7 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
SUBOPT_STREAMING | SUBOPT_DISABLE_ON_ERR |
SUBOPT_PASSWORD_REQUIRED |
SUBOPT_RUN_AS_OWNER | SUBOPT_FAILOVER |
+ SUBOPT_SKIP_NOT_EXISTS_PUB |
SUBOPT_ORIGIN);
parse_subscription_options(pstate, stmt->options,
@@ -1248,6 +1272,13 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
replaces[Anum_pg_subscription_subrunasowner - 1] = true;
}
+ if (IsSet(opts.specified_opts, SUBOPT_SKIP_NOT_EXISTS_PUB))
+ {
+ values[Anum_pg_subscription_subskipnotexistpub - 1] =
+ BoolGetDatum(opts.skipnotexistpub);
+ replaces[Anum_pg_subscription_subskipnotexistpub - 1] = true;
+ }
+
if (IsSet(opts.specified_opts, SUBOPT_FAILOVER))
{
if (!sub->slotname)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 9270d7b855..a66108aee8 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -593,6 +593,10 @@ libpqrcv_startstreaming(WalReceiverConn *conn,
appendStringInfo(&cmd, ", origin '%s'",
options->proto.logical.origin);
+ if (options->proto.logical.skipnotexistpub &&
+ PQserverVersion(conn->streamConn) >= 170000)
+ appendStringInfo(&cmd, ", skip_not_exist_publication 'true'");
+
pubnames = options->proto.logical.publication_names;
pubnames_str = stringlist_to_identifierstr(conn->streamConn, pubnames);
if (!pubnames_str)
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 9dd2446fbf..ef362e3571 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -3952,6 +3952,7 @@ maybe_reread_subscription(void)
newsub->binary != MySubscription->binary ||
newsub->stream != MySubscription->stream ||
newsub->passwordrequired != MySubscription->passwordrequired ||
+ newsub->skipnotexistpub != MySubscription->skipnotexistpub ||
strcmp(newsub->origin, MySubscription->origin) != 0 ||
newsub->owner != MySubscription->owner ||
!equal(newsub->publications, MySubscription->publications))
@@ -4380,6 +4381,9 @@ set_stream_options(WalRcvStreamOptions *options,
options->proto.logical.publication_names = MySubscription->publications;
options->proto.logical.binary = MySubscription->binary;
+ if (server_version >= 170000 && MySubscription->skipnotexistpub)
+ options->proto.logical.skipnotexistpub = true;
+
/*
* Assign the appropriate option value for streaming option according to
* the 'streaming' mode and the publisher's ability to support that mode.
diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 998f92d671..7c27d4a7d7 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -82,7 +82,8 @@ static void pgoutput_stream_prepare_txn(LogicalDecodingContext *ctx,
static bool publications_valid;
-static List *LoadPublications(List *pubnames);
+static List *LoadPublications(List *pubnames, bool skipnotexistpub,
+ bool *skipped);
static void publication_invalidation_cb(Datum arg, int cacheid,
uint32 hashvalue);
static void send_relation_and_attrs(Relation relation, TransactionId xid,
@@ -284,11 +285,13 @@ parse_output_parameters(List *options, PGOutputData *data)
bool streaming_given = false;
bool two_phase_option_given = false;
bool origin_option_given = false;
+ bool skipnotexistpub_option_given = false;
data->binary = false;
data->streaming = LOGICALREP_STREAM_OFF;
data->messages = false;
data->two_phase = false;
+ data->skipnotexistpub = false;
foreach(lc, options)
{
@@ -397,6 +400,16 @@ parse_output_parameters(List *options, PGOutputData *data)
errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized origin value: \"%s\"", origin));
}
+ else if (strcmp(defel->defname, "skip_not_exist_publication") == 0)
+ {
+ if (skipnotexistpub_option_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ skipnotexistpub_option_given = true;
+
+ data->skipnotexistpub = true;
+ }
else
elog(ERROR, "unrecognized pgoutput option: %s", defel->defname);
}
@@ -1703,9 +1716,13 @@ pgoutput_shutdown(LogicalDecodingContext *ctx)
/*
* Load publications from the list of publication names.
+ *
+ * If 'skipnotexistpub' is true, we skip publications that don't exist
+ * yet. 'skipped' will be set to true if any publication from the given
+ * list doesn't exist.
*/
static List *
-LoadPublications(List *pubnames)
+LoadPublications(List *pubnames, bool skipnotexistpub, bool *skipped)
{
List *result = NIL;
ListCell *lc;
@@ -1713,9 +1730,12 @@ LoadPublications(List *pubnames)
foreach(lc, pubnames)
{
char *pubname = (char *) lfirst(lc);
- Publication *pub = GetPublicationByName(pubname, false);
+ Publication *pub = GetPublicationByName(pubname, skipnotexistpub);
- result = lappend(result, pub);
+ if (pub)
+ result = lappend(result, pub);
+ else
+ *skipped = true;
}
return result;
@@ -1994,7 +2014,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
}
/* Validate the entry */
- if (!entry->replicate_valid)
+ if (!entry->replicate_valid || !publications_valid)
{
Oid schemaId = get_rel_namespace(relid);
List *pubids = GetRelationPublications(relid);
@@ -2011,6 +2031,7 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
bool am_partition = get_rel_relispartition(relid);
char relkind = get_rel_relkind(relid);
List *rel_publications = NIL;
+ bool skipped_pub = false;
/* Reload publications if needed before use. */
if (!publications_valid)
@@ -2021,9 +2042,17 @@ get_rel_sync_entry(PGOutputData *data, Relation relation)
list_free_deep(data->publications);
data->publications = NIL;
}
- data->publications = LoadPublications(data->publication_names);
+ data->publications = LoadPublications(data->publication_names,
+ data->skipnotexistpub,
+ &skipped_pub);
MemoryContextSwitchTo(oldctx);
- publications_valid = true;
+
+ /*
+ * We don't consider the publications to be valid till we have
+ * information of all the publications.
+ */
+ if (!skipped_pub)
+ publications_valid = true;
}
/*
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 151a5211ee..0642ff4581 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1944,8 +1944,9 @@ psql_completion(const char *text, int start, int end)
/* ALTER SUBSCRIPTION <name> SET ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SET", "("))
COMPLETE_WITH("binary", "disable_on_error", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit");
+ "password_required", "run_as_owner",
+ "skip_not_exist_publication", "slot_name", "streaming",
+ "synchronous_commit");
/* ALTER SUBSCRIPTION <name> SKIP ( */
else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) && TailMatches("SKIP", "("))
COMPLETE_WITH("lsn");
@@ -3341,8 +3342,9 @@ psql_completion(const char *text, int start, int end)
else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
"disable_on_error", "enabled", "failover", "origin",
- "password_required", "run_as_owner", "slot_name",
- "streaming", "synchronous_commit", "two_phase");
+ "password_required", "run_as_owner",
+ "skip_not_exist_publication", "slot_name", "streaming",
+ "synchronous_commit", "two_phase");
/* CREATE TRIGGER --- is allowed inside CREATE SCHEMA, so use TailMatches */
diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h
index 0aa14ec4a2..704d1217d9 100644
--- a/src/include/catalog/pg_subscription.h
+++ b/src/include/catalog/pg_subscription.h
@@ -98,6 +98,9 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ bool subskipnotexistpub; /* True if non-existent publications should
+ * be ignored */
+
#ifdef CATALOG_VARLEN /* variable-length fields start here */
/* Connection string to the publisher */
text subconninfo BKI_FORCE_NOT_NULL;
@@ -151,6 +154,8 @@ typedef struct Subscription
* (i.e. the main slot and the table sync
* slots) in the upstream database are enabled
* to be synchronized to the standbys. */
+ bool skipnotexistpub; /* True if the non-existent publications should
+ * be ignored. */
char *conninfo; /* Connection string to the publisher */
char *slotname; /* Name of the replication slot */
char *synccommit; /* Synchronous commit setting for worker */
diff --git a/src/include/replication/pgoutput.h b/src/include/replication/pgoutput.h
index 89f94e1147..38faa2ea04 100644
--- a/src/include/replication/pgoutput.h
+++ b/src/include/replication/pgoutput.h
@@ -33,6 +33,7 @@ typedef struct PGOutputData
bool messages;
bool two_phase;
bool publish_no_origin;
+ bool skipnotexistpub;
} PGOutputData;
#endif /* PGOUTPUT_H */
diff --git a/src/include/replication/walreceiver.h b/src/include/replication/walreceiver.h
index b906bb5ce8..e7a0d9e08a 100644
--- a/src/include/replication/walreceiver.h
+++ b/src/include/replication/walreceiver.h
@@ -186,6 +186,7 @@ typedef struct
* prepare time */
char *origin; /* Only publish data originating from the
* specified origin */
+ bool skipnotexistpub;
} logical;
} proto;
} WalRcvStreamOptions;
diff --git a/src/test/subscription/t/031_column_list.pl b/src/test/subscription/t/031_column_list.pl
index 938582e31a..8626397c1c 100644
--- a/src/test/subscription/t/031_column_list.pl
+++ b/src/test/subscription/t/031_column_list.pl
@@ -145,7 +145,7 @@ $node_publisher->safe_psql(
# then check the sync results
$node_subscriber->safe_psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub1 WITH (skip_not_exist_publication = true)
));
$node_subscriber->wait_for_subscription_sync;
@@ -741,7 +741,7 @@ $node_publisher->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
DROP SUBSCRIPTION sub1;
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub8;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub8 WITH (skip_not_exist_publication = true);
));
$node_subscriber->wait_for_subscription_sync;
@@ -921,7 +921,7 @@ $node_publisher->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_5, pub_mix_6;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_5, pub_mix_6 WITH (skip_not_exist_publication = true);
));
$node_subscriber->wait_for_subscription_sync;
@@ -973,7 +973,7 @@ $node_publisher->safe_psql(
# both table sync and data replication.
$node_subscriber->safe_psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_test_root, pub_test_root_1;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_test_root, pub_test_root_1 WITH (skip_not_exist_publication = true);
));
$node_subscriber->wait_for_subscription_sync;
@@ -1213,7 +1213,7 @@ $node_subscriber->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_7, pub_mix_8;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_7, pub_mix_8 WITH (skip_not_exist_publication = true);
));
$node_subscriber->wait_for_subscription_sync;
@@ -1255,7 +1255,7 @@ $node_subscriber->safe_psql(
my ($cmdret, $stdout, $stderr) = $node_subscriber->psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_1, pub_mix_2;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_1, pub_mix_2 WITH (skip_not_exist_publication = true);
));
ok( $stderr =~
@@ -1272,7 +1272,7 @@ $node_publisher->safe_psql(
$node_subscriber->safe_psql(
'postgres', qq(
- CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_1, pub_mix_2;
+ CREATE SUBSCRIPTION sub1 CONNECTION '$publisher_connstr' PUBLICATION pub_mix_1, pub_mix_2 WITH (skip_not_exist_publication = true);
));
$node_publisher->wait_for_catchup('sub1');
Hi,
On Sat, Feb 10, 2024 at 05:02:27PM -0800, Noah Misch wrote:
Coincidentally, one of my buildfarm animals hanged several weeks in a
different test, 035_standby_logical_decoding.pl. A LOG_SNAPSHOT_INTERVAL_MS
reduction was part of making it reproducible:On Fri, Feb 02, 2024 at 04:01:45PM -0800, Noah Misch wrote:
On Fri, Feb 02, 2024 at 02:30:03PM -0800, Noah Misch wrote:
On Fri, Feb 02, 2024 at 05:07:14PM -0500, Tom Lane wrote:
If you look at the buildfarm's failures page and filter down to
just subscriptionCheck failures, what you find is that all of the
last 6 such failures are in 031_column_list.pl:/messages/by-id/16d6d9cc-f97d-0b34-be65-425183ed3721@gmail.com
reported a replacement BgWriterDelay value reproducing it.Correction: the recipe changes LOG_SNAPSHOT_INTERVAL_MS in addition to
BgWriterDelay.I'm reusing this thread just in case there's overlap with the
031_column_list.pl cause and fix. The 035_standby_logical_decoding.pl hang is
a race condition arising from an event sequence like this:
- Test script sends CREATE SUBSCRIPTION to subscriber, which loses the CPU.
- Test script calls pg_log_standby_snapshot() on primary. Emits XLOG_RUNNING_XACTS.
- checkpoint_timeout makes a primary checkpoint finish. Emits XLOG_RUNNING_XACTS.
- bgwriter executes LOG_SNAPSHOT_INTERVAL_MS logic. Emits XLOG_RUNNING_XACTS.
- CREATE SUBSCRIPTION wakes up and sends CREATE_REPLICATION_SLOT to standby.
Other test code already has a solution for this, so the attached patches add a
timeout and copy the existing solution. I'm also attaching the hack that
makes it 100% reproducible.
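The fix relies on the same wait-until-condition pattern the TAP suite's poll_query_until() implements: before asking the primary to emit xl_running_xacts, poll until the standby reports a non-NULL restart_lsn for the slot. Here is a minimal Python sketch of that pattern; the function names and the simulated walsender behavior are illustrative, not PostgreSQL code:

```python
import time

def poll_until(condition, timeout=180.0, interval=0.1):
    """Retry condition() until it returns True or the timeout expires.

    Mirrors the shape of poll_query_until(): the caller waits for the
    standby to report a non-NULL restart_lsn before asking the primary
    to log a standby snapshot.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    return False

# Stand-in for "SELECT restart_lsn IS NOT NULL FROM pg_replication_slots ..."
state = {"restart_lsn": None, "calls": 0}

def restart_lsn_known():
    state["calls"] += 1
    if state["calls"] >= 3:          # pretend the walsender sets it eventually
        state["restart_lsn"] = "0/15BCDD0"
    return state["restart_lsn"] is not None

if poll_until(restart_lsn_known, timeout=5.0, interval=0.01):
    # Only now is it safe to arrange the xl_running_xacts record the
    # standby waits for, e.g. via: SELECT pg_log_standby_snapshot();
    print("restart_lsn determined:", state["restart_lsn"])
else:
    raise SystemExit("timed out waiting for logical slot to calculate its restart_lsn")
```

The ordering is the whole point: logging the standby snapshot before the slot's restart_lsn exists is what lets the earlier XLOG_RUNNING_XACTS records go to waste and the test hang.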
Thanks!
I did a few tests and confirm that the proposed solution fixes the corner case.
standby-slot-test-1-timeout-v1.patch LGTM.
Regarding standby-slot-test-2-race-v1.patch:
+# See corresponding create_logical_slot_on_standby() code.
+$node_standby->poll_query_until(
+ 'postgres', qq[
+ SELECT restart_lsn IS NOT NULL
+ FROM pg_catalog.pg_replication_slots WHERE slot_name = 'tap_sub'
+ ]) or die "timed out waiting for logical slot to calculate its restart_lsn";
+
What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm
and then make use of it in create_logical_slot_on_standby() and above? (something
like wait_for_restart_lsn_calculation-v1.patch attached).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
wait_for_restart_lsn_calculation-v1.patchtext/x-diff; charset=us-asciiDownload
commit 14d988c748f3e500c44d65e073c276e6d8af6156
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Date: Wed Feb 14 15:21:35 2024 +0000
wait_for_restart_lsn_calculation
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index e2e70d0dbf..21cf179db1 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3174,6 +3174,28 @@ $SIG{TERM} = $SIG{INT} = sub {
=pod
+=item $node->wait_for_restart_lsn_calculation(self, slot_name)
+
+Wait until the restart_lsn of the given replication slot has been determined
+
+=cut
+
+sub wait_for_restart_lsn_calculation
+{
+ my ($self, $slot_name) = @_;
+ my ($stdout, $stderr);
+
+ $self->poll_query_until(
+ 'postgres', qq[
+ SELECT restart_lsn IS NOT NULL
+ FROM pg_catalog.pg_replication_slots WHERE slot_name = '$slot_name'
+ ])
+ or die
+ "timed out waiting for logical slot to calculate its restart_lsn";
+}
+
+=pod
+
=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)
Create logical replication slot on given standby
@@ -3203,13 +3225,7 @@ sub create_logical_slot_on_standby
# xl_running_xacts WAL record from the restart_lsn onwards. First wait
# until the slot restart_lsn is determined.
- $self->poll_query_until(
- 'postgres', qq[
- SELECT restart_lsn IS NOT NULL
- FROM pg_catalog.pg_replication_slots WHERE slot_name = '$slot_name'
- ])
- or die
- "timed out waiting for logical slot to calculate its restart_lsn";
+ $self->wait_for_restart_lsn_calculation($slot_name);
# Then arrange for the xl_running_xacts record for which pg_recvlogical is
# waiting.
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index f7b7fc7f9e..85330720c5 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -465,12 +465,8 @@ $psql_subscriber{subscriber_stdin} .= "\n";
$psql_subscriber{run}->pump_nb();
-# See corresponding create_logical_slot_on_standby() code.
-$node_standby->poll_query_until(
- 'postgres', qq[
- SELECT restart_lsn IS NOT NULL
- FROM pg_catalog.pg_replication_slots WHERE slot_name = 'tap_sub'
- ]) or die "timed out waiting for logical slot to calculate its restart_lsn";
+# Wait for restart_lsn calculation
+$node_standby->wait_for_restart_lsn_calculation('tap_sub');
# Speed up the subscription creation
$node_primary->safe_psql('postgres', "SELECT pg_log_standby_snapshot()");
On Wed, Feb 14, 2024 at 12:58 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 7 Feb 2024 at 16:27, vignesh C <vignesh21@gmail.com> wrote:
I was able to reproduce the issue consistently with the changes shared
by Tom Lane at [1].
I have made changes to change ALTER SUBSCRIPTION to DROP+CREATE
SUBSCRIPTION and verified that the test has passed consistently for 50
runs that I ran. Also, the increase in test execution time for this
case is negligible:
Without patch: 7.991 seconds
With test change patch: 8.121 seconds
The test changes for the same are attached.
Alternative, this could also be fixed like the changes proposed by
Amit at [1]. In this case we ignore publications that are not found
for the purpose of computing RelSyncEntry attributes. We won't mark
such an entry as valid till all the publications are loaded without
anything missing. This means we won't publish operations on tables
corresponding to that publication till we find such a publication, and
that seems okay.
Tomas had raised a performance issue forcing us to reload it for every
replicated change/row in case the publications are invalid at [2].
Did you measure the performance impact? I think it should impact the
performance only when there is a dropped/non-existent publication as
per the subscription definition. Also, in the same email[2], Tomas
raised another concern that in such cases it is better to break the
replication.
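For illustration, the reload cost Tomas raised can be modeled roughly as follows. This is an illustrative Python sketch, not the actual pgoutput C code; the names RelSyncEntry, load_entry, and handle_change only loosely mirror the real concepts. The key behavior is that a cache entry stays invalid while any subscribed publication is missing, so every replicated change redoes the publication lookups:

```python
# Catalog state: pub_mix_8 has been dropped (or not yet created).
known_publications = {"pub_mix_7"}
lookup_count = 0

class RelSyncEntry:
    def __init__(self):
        self.valid = False
        self.pubactions = None

def load_entry(entry, subscribed):
    global lookup_count
    found = []
    for name in subscribed:
        lookup_count += 1                   # one catalog lookup per publication
        if name in known_publications:
            found.append(name)
    entry.pubactions = found
    # Only mark the entry valid once every subscribed publication exists;
    # otherwise the next change must redo all the lookups.
    entry.valid = len(found) == len(subscribed)

def handle_change(entry, subscribed):
    if not entry.valid:
        load_entry(entry, subscribed)

entry = RelSyncEntry()
subs = ["pub_mix_7", "pub_mix_8"]
for _ in range(3):                          # three replicated rows
    handle_change(entry, subs)
print(lookup_count)                         # 6: reloaded on every change

known_publications.add("pub_mix_8")         # publication created later
handle_change(entry, subs)                  # reload succeeds, entry now valid
handle_change(entry, subs)                  # cached; no further lookups
print(lookup_count)                         # 8
```

So the overhead only arises while a subscribed publication is missing; once all publications resolve, the entry is cached and the per-change cost disappears.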
How
about keeping the default option as it is and providing a new option
skip_not_exist_publication while creating/altering a subscription. In
this case, if skip_not_exist_publication is specified, we will ignore
the case where a publication is not present; such publications will be
kept invalid and get validated later.
I don't know if this deserves a separate option or not but I think it
is better to discuss this in a separate thread. To resolve the BF
failure, I still feel, we should just recreate the subscription. This
is a pre-existing problem and we can track it via a separate patch
with a test case targeting such failures.
The attached patch has the changes for the same. Thoughts?
[1] - /messages/by-id/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
[2] - /messages/by-id/dc08add3-10a8-738b-983a-191c7406707b@enterprisedb.com
--
With Regards,
Amit Kapila.
On Thu, 15 Feb 2024 at 08:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Feb 14, 2024 at 12:58 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 7 Feb 2024 at 16:27, vignesh C <vignesh21@gmail.com> wrote:
I was able to reproduce the issue consistently with the changes shared
by Tom Lane at [1].
I have made changes to change ALTER SUBSCRIPTION to DROP+CREATE
SUBSCRIPTION and verified that the test has passed consistently for 50
runs that I ran. Also, the increase in test execution time for this
case is negligible:
Without patch: 7.991 seconds
With test change patch: 8.121 seconds
The test changes for the same are attached.
Alternative, this could also be fixed like the changes proposed by
Amit at [1]. In this case we ignore publications that are not found
for the purpose of computing RelSyncEntry attributes. We won't mark
such an entry as valid till all the publications are loaded without
anything missing. This means we won't publish operations on tables
corresponding to that publication till we find such a publication, and
that seems okay.
Tomas had raised a performance issue forcing us to reload it for every
replicated change/row in case the publications are invalid at [2].
Did you measure the performance impact? I think it should impact the
performance only when there is a dropped/non-existent publication as
per the subscription definition.
The performance was almost similar in this case:
Without patch: 7.991 seconds
With skip_not_exist_publication option patch: 7.996 seconds
Also, in the same email[2], Tomas
raised another concern that in such cases it is better to break the
replication.
I will handle this in the next version
How
about keeping the default option as it is and providing a new option
skip_not_exist_publication while creating/altering a subscription. In
this case, if skip_not_exist_publication is specified, we will ignore
the case where a publication is not present; such publications will be
kept invalid and get validated later.
I don't know if this deserves a separate option or not but I think it
is better to discuss this in a separate thread.
I will start a separate thread for this.
To resolve the BF
failure, I still feel, we should just recreate the subscription. This
is a pre-existing problem and we can track it via a separate patch
with a test case targeting such failures.
+1 for going with recreation of the subscription, the changes for this
are available at [1].
[1]: /messages/by-id/CALDaNm1hLZW4H4Z61swK6WPW6pcNcjzXqw=6NqG7e-RMtkFaZA@mail.gmail.com
Regards,
Vignesh
On Wed, Feb 14, 2024 at 03:31:16PM +0000, Bertrand Drouvot wrote:
On Sat, Feb 10, 2024 at 05:02:27PM -0800, Noah Misch wrote:
The 035_standby_logical_decoding.pl hang is
a race condition arising from an event sequence like this:
- Test script sends CREATE SUBSCRIPTION to subscriber, which loses the CPU.
- Test script calls pg_log_standby_snapshot() on primary. Emits XLOG_RUNNING_XACTS.
- checkpoint_timeout makes a primary checkpoint finish. Emits XLOG_RUNNING_XACTS.
- bgwriter executes LOG_SNAPSHOT_INTERVAL_MS logic. Emits XLOG_RUNNING_XACTS.
- CREATE SUBSCRIPTION wakes up and sends CREATE_REPLICATION_SLOT to standby.
Other test code already has a solution for this, so the attached patches add a
timeout and copy the existing solution. I'm also attaching the hack that
makes it 100% reproducible.
I did a few tests and confirm that the proposed solution fixes the corner case.
Thanks for reviewing.
What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm
and then make use of it in create_logical_slot_on_standby() and above? (something
like wait_for_restart_lsn_calculation-v1.patch attached).
Waiting for restart_lsn is just a prerequisite for calling
pg_log_standby_snapshot(), so I wouldn't separate those two. If we're
extracting a sub, I would move the pg_log_standby_snapshot() call into the sub
and make the API like one of these:
$standby->wait_for_subscription_starting_point($primary, $slot_name);
$primary->log_standby_snapshot($standby, $slot_name);
Would you like to finish the patch in such a way?
Hi,
On Thu, Feb 15, 2024 at 12:48:16PM -0800, Noah Misch wrote:
On Wed, Feb 14, 2024 at 03:31:16PM +0000, Bertrand Drouvot wrote:
What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm
and then make use of it in create_logical_slot_on_standby() and above? (something
like wait_for_restart_lsn_calculation-v1.patch attached).
Waiting for restart_lsn is just a prerequisite for calling
pg_log_standby_snapshot(), so I wouldn't separate those two.
Yeah, makes sense.
If we're
extracting a sub, I would move the pg_log_standby_snapshot() call into the sub
and make the API like one of these:
$standby->wait_for_subscription_starting_point($primary, $slot_name);
$primary->log_standby_snapshot($standby, $slot_name);
Would you like to finish the patch in such a way?
Sure, please find attached standby-slot-test-2-race-v2.patch doing so. I used
log_standby_snapshot() as it seems more generic to me.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
standby-slot-test-2-race-v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 07da74cf56..5d3174d0b5 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -3181,6 +3181,37 @@ $SIG{TERM} = $SIG{INT} = sub {
=pod
+=item $node->log_standby_snapshot(self, standby, slot_name)
+
+Log a standby snapshot on primary once the slot restart_lsn is determined on
+the standby.
+
+=cut
+
+sub log_standby_snapshot
+{
+ my ($self, $standby, $slot_name) = @_;
+ my ($stdout, $stderr);
+
+ # Once the slot's restart_lsn is determined, the standby looks for
+ # xl_running_xacts WAL record from the restart_lsn onwards. First wait
+ # until the slot restart_lsn is determined.
+
+ $standby->poll_query_until(
+ 'postgres', qq[
+ SELECT restart_lsn IS NOT NULL
+ FROM pg_catalog.pg_replication_slots WHERE slot_name = '$slot_name'
+ ])
+ or die
+ "timed out waiting for logical slot to calculate its restart_lsn";
+
+ # Then arrange for the xl_running_xacts record for which the standby is
+ # waiting.
+ $self->safe_psql('postgres', 'SELECT pg_log_standby_snapshot()');
+}
+
+=pod
+
=item $node->create_logical_slot_on_standby(self, primary, slot_name, dbname)
Create logical replication slot on given standby
@@ -3206,21 +3237,9 @@ sub create_logical_slot_on_standby
'2>',
\$stderr);
- # Once the slot's restart_lsn is determined, the standby looks for
- # xl_running_xacts WAL record from the restart_lsn onwards. First wait
- # until the slot restart_lsn is determined.
-
- $self->poll_query_until(
- 'postgres', qq[
- SELECT restart_lsn IS NOT NULL
- FROM pg_catalog.pg_replication_slots WHERE slot_name = '$slot_name'
- ])
- or die
- "timed out waiting for logical slot to calculate its restart_lsn";
-
- # Then arrange for the xl_running_xacts record for which pg_recvlogical is
+ # Arrange for the xl_running_xacts record for which pg_recvlogical is
# waiting.
- $primary->safe_psql('postgres', 'SELECT pg_log_standby_snapshot()');
+ $primary->log_standby_snapshot($self, $slot_name);
$handle->finish();
diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl
index b020361b29..0710bccc17 100644
--- a/src/test/recovery/t/035_standby_logical_decoding.pl
+++ b/src/test/recovery/t/035_standby_logical_decoding.pl
@@ -465,8 +465,8 @@ $psql_subscriber{subscriber_stdin} .= "\n";
$psql_subscriber{run}->pump_nb();
-# Speed up the subscription creation
-$node_primary->safe_psql('postgres', "SELECT pg_log_standby_snapshot()");
+# Log the standby snapshot to speed up the subscription creation
+$node_primary->log_standby_snapshot($node_standby, 'tap_sub');
# Explicitly shut down psql instance gracefully - to avoid hangs
# or worse on windows
On Thu, Feb 15, 2024 at 10:46 AM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 15 Feb 2024 at 08:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
To resolve the BF
failure, I still feel, we should just recreate the subscription. This
is a pre-existing problem and we can track it via a separate patch
with a test case targeting such failures.
+1 for going with recreation of the subscription, the changes for this
are available at [1].
[1] - /messages/by-id/CALDaNm1hLZW4H4Z61swK6WPW6pcNcjzXqw=6NqG7e-RMtkFaZA@mail.gmail.com
Tom, and others, does anyone still object to going ahead with an idea
by just changing the test to recreate the subscription to silence BF
failures for this test?
Note that Vignesh has already started a thread [1] to add an option
(or otherwise) to ignore publication lookup failures.
[1]: /messages/by-id/CALDaNm0-n8FGAorM+bTxkzn+AOUyx5=L_XmnvOP6T24+-NcBKg@mail.gmail.com
--
With Regards,
Amit Kapila.
On Fri, Feb 16, 2024 at 06:37:38AM +0000, Bertrand Drouvot wrote:
On Thu, Feb 15, 2024 at 12:48:16PM -0800, Noah Misch wrote:
On Wed, Feb 14, 2024 at 03:31:16PM +0000, Bertrand Drouvot wrote:
What about creating a sub, say wait_for_restart_lsn_calculation() in Cluster.pm
and then make use of it in create_logical_slot_on_standby() and above? (something
like wait_for_restart_lsn_calculation-v1.patch attached).
Waiting for restart_lsn is just a prerequisite for calling
pg_log_standby_snapshot(), so I wouldn't separate those two.
Yeah, makes sense.
If we're
extracting a sub, I would move the pg_log_standby_snapshot() call into the sub
and make the API like one of these:
$standby->wait_for_subscription_starting_point($primary, $slot_name);
$primary->log_standby_snapshot($standby, $slot_name);
Would you like to finish the patch in such a way?
Sure, please find attached standby-slot-test-2-race-v2.patch doing so. I used
log_standby_snapshot() as it seems more generic to me.
Thanks. Pushed at commit 0e16281.
On Mon, Feb 19, 2024 at 2:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Feb 15, 2024 at 10:46 AM vignesh C <vignesh21@gmail.com> wrote:
On Thu, 15 Feb 2024 at 08:42, Amit Kapila <amit.kapila16@gmail.com> wrote:
To resolve the BF
failure, I still feel, we should just recreate the subscription. This
is a pre-existing problem and we can track it via a separate patch
with a test case targetting such failures.+1 for going with recreation of the subscription, the changes for this
are available at [1].
[1] - /messages/by-id/CALDaNm1hLZW4H4Z61swK6WPW6pcNcjzXqw=6NqG7e-RMtkFaZA@mail.gmail.com
Tom, and others, does anyone still object to going ahead with an idea
by just changing the test to recreate the subscription to silence BF
failures for this test?
Seeing no objections, I have pushed the required test changes to silence the BF.
--
With Regards,
Amit Kapila.
Amit Kapila <amit.kapila16@gmail.com> writes:
Seeing no objections, I have pushed the required test changes to silence the BF.
The farm is still unhappy in the v16 branch.
regards, tom lane
On Sat, Feb 24, 2024 at 2:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
Seeing no objections, I have pushed the required test changes to silence the BF.
The farm is still unhappy in the v16 branch.
As this feature and tests were introduced in 15, I think we should
backpatch till 15. I'll do that early next week unless you or someone
else thinks otherwise.
--
With Regards,
Amit Kapila.
On 20 Feb 2024, at 04:09, Noah Misch <noah@leadboat.com> wrote:
I’m not sure if it is connected, but so far many patches in CFbot keep hanging in this test. For example [0].
I haven’t done root cause analysis yet, but hangs may be related to this thread. Maybe someone more familiar with similar issues could take a look there?
Thanks!
Best regards, Andrey Borodin.
[0]: https://cirrus-ci.com/task/5911176840740864?logs=check_world#L292
On Thu, Mar 07, 2024 at 02:46:55PM +0500, Andrey M. Borodin wrote:
I’m not sure if it is connected, but so far many patches in CFbot keep hanging in this test. For example [0].
[0] https://cirrus-ci.com/task/5911176840740864?logs=check_world#L292
Relevant part:
[22:03:10.292] stderr:
[22:03:10.292] # poll_query_until timed out executing this query:
[22:03:10.292] # SELECT (SELECT catalog_xmin::text::int - 770 from pg_catalog.pg_replication_slots where slot_name = 'injection_activeslot') > 0
[22:03:10.292] # expecting this output:
[22:03:10.292] # t
[22:03:10.292] # last actual query output:
[22:03:10.292] # f
[22:03:10.292] # with stderr:
[22:03:10.292] # Tests were run but no plan was declared and done_testing() was not seen.
[22:03:10.292] # Looks like your test exited with 255 just after 57.
The injection_activeslot test got added after $SUBJECT, in 08a52ab151
(2024-03-06). It's now reverted in 65db0cfb4c (2024-03-07).