Re: Add a perl function in Cluster.pm to generate WAL

Started by Bharath Rupireddyabout 3 years ago24 messageshackers

bharath.rupireddyforpostgres@gmail.com

about 3 years ago

On Wed, Aug 24, 2022 at 6:42 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 16 Aug 2022 18:40:49 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2022-Aug-16, Andrew Dunstan wrote:

I don't think there's a hard and fast rule about it. Certainly the case
would be more compelling if the functions were used across different TAP
suites. The SSL suite has suite-specific modules. That's a pattern also
worth considering. e.g something like.

use FindBin qw($Bin);
use lib $Bin;
use MySuite;

and then you put your common routines in MySuite.pm in the same
directory as the TAP test files.

Yeah, I agree with that for advance_wal. Regarding find_in_log, that
one seems general enough to warrant being in Cluster.pm -- consider
issues_sql_like, which also slurps_file($log). That could be unified a
little bit, I think.

+1

With the generalized function for find_in_log() has been added as part
of https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e25e5f7fc6b74c9d4ce82627e9145ef5537412e2,
I'm proposing a generalized function for advance_wal(). Please find
the attached patch.

I tried to replace the existing tests with the new cluster function
advance_wal(). Please let me know if I'm missing any other tests.
Also, this new function can be used by an in-progress feature -
https://commitfest.postgresql.org/43/3663/.

Thoughts?

FWIW, it's discussed here -
/messages/by-id/ZIKVd+a43UfsIWJE@paquier.xyz.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Kyotaro Horiguchi

horikyota.ntt@gmail.com

about 3 years ago

In reply to: Bharath Rupireddy (#1)

Mmm. It seems like the email I thought I'd sent failed to go out.

At Sun, 11 Jun 2023 07:14:54 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in

On Wed, Aug 24, 2022 at 6:42 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 16 Aug 2022 18:40:49 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2022-Aug-16, Andrew Dunstan wrote:
Yeah, I agree with that for advance_wal. Regarding find_in_log, that
one seems general enough to warrant being in Cluster.pm -- consider
issues_sql_like, which also slurps_file($log). That could be unified a
little bit, I think.

+1

With the generalized function for find_in_log() has been added as part
of https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e25e5f7fc6b74c9d4ce82627e9145ef5537412e2,
I'm proposing a generalized function for advance_wal(). Please find
the attached patch.

I tried to replace the existing tests with the new cluster function
advance_wal(). Please let me know if I'm missing any other tests.
Also, this new function can be used by an in-progress feature -
https://commitfest.postgresql.org/43/3663/.

Thoughts?

Thanks!

+ "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");

At least since 11, we can utilize pg_logical_emit_message() for this
purpose. It's more lightweight and seems appropriate, not only because
it doesn't cause any side effects but also bacause we don't have to
worry about name conflicts.

-		 SELECT 'finished';",
-		timeout => $PostgreSQL::Test::Utils::timeout_default));
-is($result[1], 'finished', 'check if checkpoint command is not blocked');
-
+$node_primary2->advance_wal(1);
+$node_primary2->safe_psql('postgres', 'CHECKPOINT;');

This test anticipates that the checkpoint could get blocked. Shouldn't
we keep the timeout?

-$node_primary->safe_psql(
-	'postgres', "create table retain_test(a int);
-									 select pg_switch_wal();
-									 insert into retain_test values(1);
-									 checkpoint;");
+$node_primary->advance_wal(1);
+$node_primary->safe_psql('postgres', "checkpoint;");

The original test generated some WAL after the segment switch, which
appears to be a significant characteristics of the test.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Michael Paquier

michael@paquier.xyz

about 3 years ago

In reply to: Kyotaro Horiguchi (#2)

On Thu, Jun 15, 2023 at 01:40:15PM +0900, Kyotaro Horiguchi wrote:

+ "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");

At least since 11, we can utilize pg_logical_emit_message() for this
purpose. It's more lightweight and seems appropriate, not only because
it doesn't cause any side effects but also bacause we don't have to
worry about name conflicts.

Making this as cheap as possible by design is a good concept for a
common routine. +1.

-		 SELECT 'finished';",
-		timeout => $PostgreSQL::Test::Utils::timeout_default));
-is($result[1], 'finished', 'check if checkpoint command is not blocked');
-
+$node_primary2->advance_wal(1);
+$node_primary2->safe_psql('postgres', 'CHECKPOINT;');

This test anticipates that the checkpoint could get blocked. Shouldn't
we keep the timeout?

Indeed, this would partially invalidate what's getting tested in light
of 1816a1c6 where we run a secondary command after the checkpoint. So
the last SELECT should remain around.

-$node_primary->safe_psql(
-    'postgres', "create table retain_test(a int);
-                                     select pg_switch_wal();
-                                     insert into retain_test values(1);
-                                     checkpoint;");
+$node_primary->advance_wal(1);
+$node_primary->safe_psql('postgres', "checkpoint;");

The original test generated some WAL after the segment switch, which
appears to be a significant characteristics of the test.

Still it does not matter for this specific case? The logical slot has
been already invalidated, so we don't care much about logical changes
in WAL, do we?
--
Michael

Kyotaro Horiguchi

horikyota.ntt@gmail.com

about 3 years ago

In reply to: Michael Paquier (#3)

Thanks for the comments.

At Fri, 16 Jun 2023 11:30:15 +0900, Michael Paquier <michael@paquier.xyz> wrote in

-$node_primary->safe_psql(
-    'postgres', "create table retain_test(a int);
-                                     select pg_switch_wal();
-                                     insert into retain_test values(1);
-                                     checkpoint;");
+$node_primary->advance_wal(1);
+$node_primary->safe_psql('postgres', "checkpoint;");
The original test generated some WAL after the segment switch, which
appears to be a significant characteristics of the test.
Still it does not matter for this specific case? The logical slot has
been already invalidated, so we don't care much about logical changes
in WAL, do we?

The change itself doesn't seem to matter, but it seems intended to let
checkpoint trigger the removal of the last segment. However, I'm
unsure how the insert would influence this that way. If my
understanding is correct, then I'd support its removal.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

almost 3 years ago

In reply to: Michael Paquier (#3)

On Fri, Jun 16, 2023 at 8:00 AM Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Jun 15, 2023 at 01:40:15PM +0900, Kyotaro Horiguchi wrote:

+ "CREATE TABLE tt (); DROP TABLE tt; SELECT pg_switch_wal();");

At least since 11, we can utilize pg_logical_emit_message() for this
purpose. It's more lightweight and seems appropriate, not only because
it doesn't cause any side effects but also bacause we don't have to
worry about name conflicts.

Making this as cheap as possible by design is a good concept for a
common routine. +1.

While it seems reasonable to use pg_logical_emit_message, it won't
work for all the cases - what if someone wants to advance WAL by a few
WAL files? I think pg_switch_wal() is the way, no? For instance, the
replslot_limit.pl test increases the WAL in a very calculated way - it
increases by 5 WAL files. So, -1 to use pg_logical_emit_message.

I understand the naming conflicts for the table name used ('tt' in
this case). If the table name 'tt' looks so simple and easy for
someone to have tables with that name in their tests file, we can
generate a random table name in advance_wal, something like in the
attached v2 patch.

-              SELECT 'finished';",
-             timeout => $PostgreSQL::Test::Utils::timeout_default));
-is($result[1], 'finished', 'check if checkpoint command is not blocked');
-
+$node_primary2->advance_wal(1);
+$node_primary2->safe_psql('postgres', 'CHECKPOINT;');
This test anticipates that the checkpoint could get blocked. Shouldn't
we keep the timeout?
Indeed, this would partially invalidate what's getting tested in light
of 1816a1c6 where we run a secondary command after the checkpoint. So
the last SELECT should remain around.

Changed.

-$node_primary->safe_psql(
-    'postgres', "create table retain_test(a int);
-                                     select pg_switch_wal();
-                                     insert into retain_test values(1);
-                                     checkpoint;");
+$node_primary->advance_wal(1);
+$node_primary->safe_psql('postgres', "checkpoint;");
The original test generated some WAL after the segment switch, which
appears to be a significant characteristics of the test.
Still it does not matter for this specific case? The logical slot has
been already invalidated, so we don't care much about logical changes
in WAL, do we?

Correct, the slot has already been invalidated and the test is
verifying that WAL isn't retained by the invalidated slot, so
essentially what it needs is to generate "some" wal. So, using
advance_wal there seems fine to me. CFBot doesn't complain anything -
https://github.com/BRupireddy/postgres/tree/add_a_TAP_test_function_to_generate_WAL_v2.

Attached the v2 patch. Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 2 years ago

In reply to: Bharath Rupireddy (#5)

On Wed, Jul 19, 2023 at 4:11 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Attached the v2 patch. Thoughts?

Rebase needed, attached v3 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Euler Taveira

euler@eulerto.com

over 2 years ago

In reply to: Bharath Rupireddy (#6)

On Mon, Dec 18, 2023, at 2:39 AM, Bharath Rupireddy wrote:

Rebase needed, attached v3 patch.

I think you don't understand the suggestion proposed by Michael and Kyotaro. If
you do a comparison with the following SQL commands:

euler=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
--------------------------
000000010000000000000040
(1 row)

euler=# select pg_logical_emit_message(true, 'prefix', 'message4');
pg_logical_emit_message
-------------------------
0/400000A8
(1 row)

euler=# select pg_switch_wal();
pg_switch_wal
---------------
0/400000F0
(1 row)

euler=# create table cc (b int);
CREATE TABLE
euler=# drop table cc;
DROP TABLE
euler=# select pg_switch_wal();
pg_switch_wal
---------------
0/41017C88
(1 row)

euler=# select pg_walfile_name(pg_current_wal_lsn());
pg_walfile_name
--------------------------
000000010000000000000041
(1 row)

You get

$ pg_waldump 000000010000000000000040
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/40000028, prev 0/3F0001C0, desc: RUNNING_XACTS nextXid 295180 latestCompletedXid 295179 oldestRunningXid 295180
rmgr: LogicalMessage len (rec/tot): 65/ 65, tx: 295180, lsn: 0/40000060, prev 0/40000028, desc: MESSAGE transactional, prefix "prefix"; payload (8 bytes): 6D 65 73 73 61 67 65 34
rmgr: Transaction len (rec/tot): 46/ 46, tx: 295180, lsn: 0/400000A8, prev 0/40000060, desc: COMMIT 2023-12-18 08:35:06.821322 -03
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/400000D8, prev 0/400000A8, desc: SWITCH

$ pg_waldump 000000010000000000000041
rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/41000028, prev 0/400000D8, desc: RUNNING_XACTS nextXid 295181 latestCompletedXid 295180 oldestRunningXid 295181
rmgr: Storage len (rec/tot): 42/ 42, tx: 0, lsn: 0/41000060, prev 0/41000028, desc: CREATE base/33287/88102
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295181, lsn: 0/41000090, prev 0/41000060, desc: NEW_CID rel: 1663/33287/1247, tid: 14/16, cmin: 0, cmax: 4294967295, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 3086, tx: 295181, lsn: 0/410000D0, prev 0/41000090, desc: INSERT off: 16, flags: 0x00, blkref #0: rel 1663/33287/1247 blk 14 FPW
rmgr: Btree len (rec/tot): 53/ 5133, tx: 295181, lsn: 0/41000CE0, prev 0/410000D0, desc: INSERT_LEAF off: 252, blkref #0: rel 1663/33287/2703 blk 2 FPW
.
.
.
rmgr: Btree len (rec/tot): 72/ 72, tx: 295181, lsn: 0/41016E48, prev 0/41014F00, desc: INSERT_LEAF off: 111, blkref #0: rel 1663/33287/2674 blk 7
rmgr: Heap2 len (rec/tot): 69/ 69, tx: 295181, lsn: 0/41016E90, prev 0/41016E48, desc: PRUNE snapshotConflictHorizon: 295177, nredirected: 0, ndead: 7, nunused: 0, redirected: [], dead: [20, 21, 22, 23, 24, 26, 27], unused: [], blkref #0: rel 1663/33287/1249 blk 17
rmgr: Transaction len (rec/tot): 385/ 385, tx: 295181, lsn: 0/41016ED8, prev 0/41016E90, desc: INVALIDATION ; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 88102
rmgr: Standby len (rec/tot): 42/ 42, tx: 295181, lsn: 0/41017060, prev 0/41016ED8, desc: LOCK xid 295181 db 33287 rel 88102
rmgr: Transaction len (rec/tot): 405/ 405, tx: 295181, lsn: 0/41017090, prev 0/41017060, desc: COMMIT 2023-12-18 08:35:22.342122 -03; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 55 catcache 54 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 88102
rmgr: Standby len (rec/tot): 42/ 42, tx: 295182, lsn: 0/41017228, prev 0/41017090, desc: LOCK xid 295182 db 33287 rel 88102
rmgr: Heap2 len (rec/tot): 61/ 61, tx: 295182, lsn: 0/41017258, prev 0/41017228, desc: PRUNE snapshotConflictHorizon: 295177, nredirected: 0, ndead: 3, nunused: 0, redirected: [], dead: [9, 12, 15], unused: [], blkref #0: rel 1663/33287/2608 blk 3
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295182, lsn: 0/41017298, prev 0/41017258, desc: NEW_CID rel: 1663/33287/1247, tid: 14/17, cmin: 4294967295, cmax: 0, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 54, tx: 295182, lsn: 0/410172D8, prev 0/41017298, desc: DELETE xmax: 295182, off: 17, infobits: [KEYS_UPDATED], flags: 0x00, blkref #0: rel 1663/33287/1247 blk 14
.
.
.
rmgr: Heap2 len (rec/tot): 60/ 60, tx: 295182, lsn: 0/410178D8, prev 0/410178A0, desc: NEW_CID rel: 1663/33287/2608, tid: 3/24, cmin: 4294967295, cmax: 2, combo: 4294967295
rmgr: Heap len (rec/tot): 54/ 54, tx: 295182, lsn: 0/41017918, prev 0/410178D8, desc: DELETE xmax: 295182, off: 24, infobits: [KEYS_UPDATED], flags: 0x00, blkref #0: rel 1663/33287/2608 blk 3
rmgr: Transaction len (rec/tot): 321/ 321, tx: 295182, lsn: 0/41017950, prev 0/41017918, desc: INVALIDATION ; inval msgs: catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 relcache 88102 snapshot 2608
rmgr: Transaction len (rec/tot): 469/ 469, tx: 295182, lsn: 0/41017A98, prev 0/41017950, desc: COMMIT 2023-12-18 08:35:25.053905 -03; rels: base/33287/88102; dropped stats: 2/33287/88102; inval msgs: catcache 80 catcache 79 catcache 80 catcache 79 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 55 catcache 54 snapshot 2608 snapshot 2608 relcache 88102 snapshot 2608
rmgr: XLOG len (rec/tot): 24/ 24, tx: 0, lsn: 0/41017C70, prev 0/41017A98, desc: SWITCH

The difference is

euler=# select '0/400000A8'::pg_lsn - '0/40000028'::pg_lsn;
?column?
----------
128
(1 row)

euler=# select '0/41017A98'::pg_lsn - '0/41000028'::pg_lsn;
?column?
----------
96880
(1 row)

It is cheaper.

--
Euler Taveira
EDB https://www.enterprisedb.com/

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Euler Taveira (#7)

On Mon, Dec 18, 2023 at 08:48:09AM -0300, Euler Taveira wrote:

It is cheaper.

Agreed that this could just use a set of pg_logical_emit_message()
when jumping across N segments. Another thing that seems quite
important to me is to force a flush of WAL with the last segment
switch, and the new "flush" option of pg_logical_emit_message() can
be very handy for this purpose.
--
Michael

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

over 2 years ago

In reply to: Michael Paquier (#8)

On Tue, Dec 19, 2023 at 9:51 AM Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Dec 18, 2023 at 08:48:09AM -0300, Euler Taveira wrote:

It is cheaper.

Agreed that this could just use a set of pg_logical_emit_message()
when jumping across N segments.

Thanks. I missed the point of using pg_logical_emit_message() over
CREATE .. DROP TABLE to generate WAL. And, I agree that it's better
and relatively cheaper in terms of amount of WAL generated.

Another thing that seems quite
important to me is to force a flush of WAL with the last segment
switch, and the new "flush" option of pg_logical_emit_message() can
be very handy for this purpose.

I used pg_logical_emit_message() in non-transactional mode without
needing an explicit WAL flush as the pg_switch_wal() does a WAL flush
at the end [1]/* * If this was an XLOG_SWITCH record, flush the record and the empty * padding space that fills the rest of the segment, and perform * end-of-segment actions (eg, notifying archiver). */ if (class == WALINSERT_SPECIAL_SWITCH) { TRACE_POSTGRESQL_WAL_SWITCH(); XLogFlush(EndPos);.

Attached v4 patch.

[1]: /* * If this was an XLOG_SWITCH record, flush the record and the empty * padding space that fills the rest of the segment, and perform * end-of-segment actions (eg, notifying archiver). */ if (class == WALINSERT_SPECIAL_SWITCH) { TRACE_POSTGRESQL_WAL_SWITCH(); XLogFlush(EndPos);
/*
* If this was an XLOG_SWITCH record, flush the record and the empty
* padding space that fills the rest of the segment, and perform
* end-of-segment actions (eg, notifying archiver).
*/
if (class == WALINSERT_SPECIAL_SWITCH)
{
TRACE_POSTGRESQL_WAL_SWITCH();
XLogFlush(EndPos);

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#10

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Bharath Rupireddy (#9)

On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote:

I used pg_logical_emit_message() in non-transactional mode without
needing an explicit WAL flush as the pg_switch_wal() does a WAL flush
at the end [1].

Indeed, that should be enough to answer my comment.

Attached v4 patch.

LGTM, thanks. Euler, what do you think?
--
Michael

#11

Euler Taveira

euler@eulerto.com

over 2 years ago

In reply to: Michael Paquier (#10)

On Tue, Dec 19, 2023, at 8:00 PM, Michael Paquier wrote:

On Tue, Dec 19, 2023 at 11:25:50AM +0530, Bharath Rupireddy wrote:

I used pg_logical_emit_message() in non-transactional mode without
needing an explicit WAL flush as the pg_switch_wal() does a WAL flush
at the end [1].

Indeed, that should be enough to answer my comment.

Attached v4 patch.

LGTM, thanks. Euler, what do you think?

LGTM.

--
Euler Taveira
EDB https://www.enterprisedb.com/

#12

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Euler Taveira (#11)

On Wed, Dec 20, 2023 at 12:24:04AM -0300, Euler Taveira wrote:

LGTM.

I was eyeing at 020_messages.pl and it has a pg_switch_wal() after a
transaction rollbacked. 020_archive_status.pl creates a table, does
one segment switch, then a checkpoint (table is used afterwards).
Perhaps these could be changed with the new routine, but it does not
seem like this improves the readability of the tests, either, contrary
to the ones updated here where a fake table is created to force some
records. What do you think?

We have a few more pg_switch_wal() calls, as well, but these rely on
WAL being already generated beforehand.

I have added a comment about pg_logical_emit_message() being in
non-transactional mode and the flush implied by pg_switch_wal() as
that's important, edited a bit the whole, then applied the patch.
--
Michael

#13

Tom Lane

tgl@sss.pgh.pa.us

over 2 years ago

In reply to: Michael Paquier (#12)

Michael Paquier <michael@paquier.xyz> writes:

I have added a comment about pg_logical_emit_message() being in
non-transactional mode and the flush implied by pg_switch_wal() as
that's important, edited a bit the whole, then applied the patch.

Buildfarm member skink has failed 3 times in
035_standby_logical_decoding.pl in the last couple of days:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

These all look like

# poll_query_until timed out executing this query:
# select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'
# expecting this output:
# t
# last actual query output:
# f

although it's notable that two different tests are involved
(vacuum vs. vacuum full).

I am not real sure what is happening there, but I see that c161ab74f
changed some details of how that test works, so I wonder if it's
responsible for these failures. The timing isn't a perfect match,
since this commit went in two weeks ago, but I don't see any
more-recent commits that seem like plausible explanations.

regards, tom lane

#14

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Tom Lane (#13)

On Wed, Jan 03, 2024 at 06:39:29PM -0500, Tom Lane wrote:

I am not real sure what is happening there, but I see that c161ab74f
changed some details of how that test works, so I wonder if it's
responsible for these failures. The timing isn't a perfect match,
since this commit went in two weeks ago, but I don't see any
more-recent commits that seem like plausible explanations.

Likely the INSERT query on retain_test that has been removed from the
test is impacting the slot conflict analysis that we'd expect.
--
Michael

#15

Alexander Lakhin

exclusion@gmail.com

over 2 years ago

In reply to: Tom Lane (#13)

Hello Tom,

04.01.2024 02:39, Tom Lane wrote:

Buildfarm member skink has failed 3 times in
035_standby_logical_decoding.pl in the last couple of days:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

These all look like

# poll_query_until timed out executing this query:
# select (confl_active_logicalslot = 1) from pg_stat_database_conflicts where datname = 'testdb'
# expecting this output:
# t
# last actual query output:
# f

although it's notable that two different tests are involved
(vacuum vs. vacuum full).

I am not real sure what is happening there, but I see that c161ab74f
changed some details of how that test works, so I wonder if it's
responsible for these failures. The timing isn't a perfect match,
since this commit went in two weeks ago, but I don't see any
more-recent commits that seem like plausible explanations.

Reproduced here.
As I can see in the failure logs you referenced, the first problem is:
# Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
# at t/035_standby_logical_decoding.pl line 224.

It reminded me of:
/messages/by-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010@gmail.com

It seems that it's not something new, and maybe that my analysis is still
valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.

Best regards,
Alexander

#16

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Alexander Lakhin (#15)

On Thu, Jan 04, 2024 at 04:00:01PM +0300, Alexander Lakhin wrote:

Reproduced here.

Did you just make the run slow enough to show the failure with
valgrind?

As I can see in the failure logs you referenced, the first problem is:
# Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
# at t/035_standby_logical_decoding.pl line 224.
It reminded me of:
/messages/by-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010@gmail.com

It seems that it's not something new, and maybe that my analysis is still
valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.

Not sure about that yet.
--
Michael

#17

Alexander Lakhin

exclusion@gmail.com

over 2 years ago

In reply to: Michael Paquier (#16)

05.01.2024 02:48, Michael Paquier wrote:

On Thu, Jan 04, 2024 at 04:00:01PM +0300, Alexander Lakhin wrote:

Reproduced here.

Did you just make the run slow enough to show the failure with
valgrind?

Yes, I just run several test instances (with no extra modifications) under
Valgrind with parallel as follows:
for i in `seq 20`; do cp -r src/test/recovery/ src/test/recovery_$i/; sed "s|src/test/recovery|src/test/recovery_$i|" -i
src/test/recovery_$i/Makefile; done

for i in `seq 20`; do echo "iteration $i"; parallel --halt now,fail=1 -j20 --linebuffer --tag PROVE_TESTS="t/035*"
NO_TEMP_INSTALL=1 make check -s -C src/test/recovery_{} PROVE_FLAGS="--timer" ::: `seq 20` || break; done

The test run fails for me on iterations 9, 8, 14 like so:
...
iteration 9
...
5
5       #   Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
5       #   at t/035_standby_logical_decoding.pl line 224.
5
5       #   Failed test 'activeslot slot invalidation is logged with vacuum on pg_authid'
5       #   at t/035_standby_logical_decoding.pl line 229.
13      [07:35:53] t/035_standby_logical_decoding.pl .. ok   432930 ms ( 0.02 usr 0.00 sys + 292.08 cusr 13.05 csys =
305.15 CPU)
13      [07:35:53]
13      All tests successful.
...

I've also reproduced it without Valgrind in a VM with CPU slowed down to
5% (on iterations 2, 5, 4), where average test duration is about 800 sec:

4
4       #   Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
4       #   at t/035_standby_logical_decoding.pl line 222.
4
4       #   Failed test 'activeslot slot invalidation is logged with vacuum on pg_authid'
4       #   at t/035_standby_logical_decoding.pl line 227.
6       [15:16:53] t/035_standby_logical_decoding.pl .. ok   763168 ms ( 0.68 usr 0.10 sys + 19.55 cusr 102.59 csys =
122.92 CPU)

As I can see in the failure logs you referenced, the first problem is:
# Failed test 'inactiveslot slot invalidation is logged with vacuum on pg_authid'
# at t/035_standby_logical_decoding.pl line 224.
It reminded me of:
/messages/by-id/b2a1f7d0-7629-72c0-2da7-e9c4e336b010@gmail.com

It seems that it's not something new, and maybe that my analysis is still
valid. If so, VACUUM FREEZE/autovacuum = off should fix the issue.

Not sure about that yet.

Your suspicion was proved right. After
git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R
20 iterations with 20 tests in parallel performed successfully for me
(twice).

So it looks like c161ab74f really made the things worse.

Best regards,
Alexander

#18

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Alexander Lakhin (#17)

On Fri, Jan 05, 2024 at 11:00:00PM +0300, Alexander Lakhin wrote:

Your suspicion was proved right. After
git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R
20 iterations with 20 tests in parallel performed successfully for me
(twice).

So it looks like c161ab74f really made the things worse.

We have two different failures here, one when VACUUM fails for a
shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

And the second failure happens for VACUUM FULL with a shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15

In the second case, the VACUUM FULL happens *BEFORE* the new
advance_wal(), making c161ab74f unrelated, no?

Anyway, if one looks at the buildfarm logs, this failure is more
ancient than c161ab74f. We have many of them before that, some
reported back in October:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-19%2000%3A44%3A58
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-30%2013%3A39%3A20

I suspect on the contrary that c161ab74f may be actually helping here,
because we've switched the CREATE TABLE/INSERT queries to not use a
snapshot anymore, reducing the reasons why a slot conflict would
happen? Or maybe that's just a matter of luck because the test is
racy anyway.

Anyway, this has the smell of a legit bug to me. I am also a bit
dubious about the choice of pg_authid as shared catalog to choose for
the slot invalidation check. Isn't that potentially racy with the
scans we may do on it at connection startup? Something else should be
chosen, like pg_shdescription as it is non-critical? I am adding in
CC Bertrand and Andres, as author and committer behind befcd77d53217b.
--
Michael

#19

Alexander Lakhin

exclusion@gmail.com

over 2 years ago

In reply to: Michael Paquier (#18)

07.01.2024 10:10, Michael Paquier wrote:

On Fri, Jan 05, 2024 at 11:00:00PM +0300, Alexander Lakhin wrote:

Your suspicion was proved right. After
git show c161ab74f src/test/recovery/t/035_standby_logical_decoding.pl | git apply -R
20 iterations with 20 tests in parallel performed successfully for me
(twice).

So it looks like c161ab74f really made the things worse.

After more waiting, I saw the test failure (with c161ab74f reverted) on
iteration 17 in VM where one test run takes up to 800 seconds.

We have two different failures here, one when VACUUM fails for a
shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2017%3A09%3A27
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-01%2020%3A10%3A18

And the second failure happens for VACUUM FULL with a shared relation:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2024-01-03%2020%3A07%3A15

In the second case, the VACUUM FULL happens *BEFORE* the new
advance_wal(), making c161ab74f unrelated, no?

Anyway, if one looks at the buildfarm logs, this failure is more
ancient than c161ab74f. We have many of them before that, some
reported back in October:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-19%2000%3A44%3A58
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=skink&dt=2023-10-30%2013%3A39%3A20

Yes, I wrote exactly about that upthread and referenced my previous
investigation. But what I'm observing now, is that the failure probability
greatly increased with c161ab74f, so something really changed in the test
behaviour. (I need a couple of days to investigate this.)

Best regards,
Alexander

#20

Michael Paquier

michael@paquier.xyz

over 2 years ago

In reply to: Alexander Lakhin (#19)

On Sun, Jan 07, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:

Yes, I wrote exactly about that upthread and referenced my previous
investigation. But what I'm observing now, is that the failure probability
greatly increased with c161ab74f, so something really changed in the test
behaviour. (I need a couple of days to investigate this.)

As far as I've cross-checked the logs between successful and failed
runs on skink and my own machines (not reproduced it locally
unfortunately), I did not notice a correlation with autovacuum running
while VACUUM (with or without FULL) is executed on the catalogs.
Perhaps a next sensible step would be to plug-in pg_waldump or
pg_walinspect and get some sense from the WAL records if we fail to
detect an invalidation from the log contents, from a LSN retrieved
slightly at the beginning of each scenario.

I would be tempted to add more increments of $Test::Builder::Level as
well in the subroutines of the test because it is kind of hard to find
out from where a failure comes now. One needs to grep for the
slot names, whose strings are built from prefixes and suffixes defined
as arguments of these subroutines...
--
Michael

#21