BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

Started by PG Bug reporting formover 1 year ago10 messages
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 18641
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 17.0
Operating system: Ubuntu 22.04
Description:

Logical decoding of a transaction like:
BEGIN;
INSERT INTO test_tab VALUES(1);
PREPARE TRANSACTION 'pt';

where test_tab defined as:
CREATE TABLE test_tab(a int primary key,
t text DEFAULT 'Some TOASTable value';

for a subscription created with two_phase = on, fails as below:
2024-09-28 06:44:50.708 UTC [3741774:6][client backend][6/2:740] LOG:
statement: PREPARE TRANSACTION 'pt';
2024-09-28 06:44:50.709 UTC [3741774:7][client backend][:0] LOG:
disconnection: session time: 0:00:00.006 user=law database=postgres
host=[local]
2024-09-28 06:44:50.713 UTC [3741741:17][walsender][25/0:0] ERROR:
unexpected table_index_fetch_tuple call during logical decoding
2024-09-28 06:44:50.713 UTC [3741741:18][walsender][25/0:0] BACKTRACE:
table_index_fetch_tuple at tableam.h:1253:3
index_fetch_heap at indexam.c:637:10
index_getnext_slot at indexam.c:697:6
systable_getnext_ordered at genam.c:717:5
heap_fetch_toast_slice at heaptoast.c:698:17
table_relation_fetch_toast_slice at tableam.h:1924:1
toast_fetch_datum at detoast.c:379:2
detoast_attr at detoast.c:123:10
pg_detoast_datum_packed at fmgr.c:1867:10
text_to_cstring at varlena.c:220:23
AttrDefaultFetch at relcache.c:4537:17
RelationBuildTupleDesc at relcache.c:697:4
RelationBuildDesc at relcache.c:1188:24
RelationIdGetRelation at relcache.c:2116:7
ReorderBufferProcessTXN at reorderbuffer.c:2246:17
ReorderBufferReplay at reorderbuffer.c:2725:2
ReorderBufferPrepare at reorderbuffer.c:2822:2
DecodePrepare at decode.c:826:2
xact_decode at decode.c:347:5
LogicalDecodingProcessRecord at decode.c:123:1
XLogSendLogical at walsender.c:3413:33
WalSndLoop at walsender.c:2814:4
StartLogicalReplication at walsender.c:1523:2
exec_replication_command at walsender.c:2148:5
PostgresMain at postgres.c:4763:11
BackendInitialize at backend_startup.c:123:1
postmaster_child_launch at launch_backend.c:281:9
BackendStartup at postmaster.c:3593:8
ServerLoop at postmaster.c:1677:10
PostmasterMain at postmaster.c:1372:11
startup_hacks at main.c:217:1
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)
[0x7f30a90d7e40]
postgres: node_publisher: walsender law postgres [local]
START_REPLICATION(_start+0x25) [0x5647b6023565]
2024-09-28 06:44:50.713 UTC [3741741:19][walsender][25/0:0] LOG: released
logical replication slot "test_sub"
2024-09-28 06:44:50.713 UTC [3741741:20][walsender][25/0:0] LOG: could not
send data to client: Broken pipe
2024-09-28 06:44:50.713 UTC [3741741:21][walsender][25/0:0] FATAL:
connection to client lost

The issue can be easily reproduced with 022_twophase_cascade.pl modified
like this:
@@ -67,0 +68 @@ $node_A->safe_psql('postgres',
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
@@ -69 +70 @@ $node_B->safe_psql('postgres',
-    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999)"
+    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999, t text DEFAULT '$default')"
@@ -72 +73 @@ $node_C->safe_psql('postgres',
-    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999)"
+    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999, t text DEFAULT '$default')"

Reproduced on REL_15_STABLE (starting from a8fd13cab) .. master.

#2Takeshi Ideriha
iderihatakeshi@gmail.com
In reply to: PG Bug reporting form (#1)
1 attachment(s)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

Hi, Alexander and hackers

The following bug has been logged on the website:

Bug reference: 18641
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 17.0
Operating system: Ubuntu 22.04
Description:

Logical decoding of a transaction like:
BEGIN;
INSERT INTO test_tab VALUES(1);
PREPARE TRANSACTION 'pt';

where test_tab defined as:
CREATE TABLE test_tab(a int primary key,
t text DEFAULT 'Some TOASTable value';

for a subscription created with two_phase = on, fails as below:
2024-09-28 06:44:50.708 UTC [3741774:6][client backend][6/2:740] LOG:
statement: PREPARE TRANSACTION 'pt';
2024-09-28 06:44:50.709 UTC [3741774:7][client backend][:0] LOG:
disconnection: session time: 0:00:00.006 user=law database=postgres
host=[local]
2024-09-28 06:44:50.713 UTC [3741741:17][walsender][25/0:0] ERROR:
unexpected table_index_fetch_tuple call during logical decoding
2024-09-28 06:44:50.713 UTC [3741741:18][walsender][25/0:0] BACKTRACE:
table_index_fetch_tuple at tableam.h:1253:3
index_fetch_heap at indexam.c:637:10
index_getnext_slot at indexam.c:697:6
systable_getnext_ordered at genam.c:717:5
heap_fetch_toast_slice at heaptoast.c:698:17
table_relation_fetch_toast_slice at tableam.h:1924:1
toast_fetch_datum at detoast.c:379:2
detoast_attr at detoast.c:123:10
pg_detoast_datum_packed at fmgr.c:1867:10
text_to_cstring at varlena.c:220:23
AttrDefaultFetch at relcache.c:4537:17
RelationBuildTupleDesc at relcache.c:697:4
RelationBuildDesc at relcache.c:1188:24
RelationIdGetRelation at relcache.c:2116:7
ReorderBufferProcessTXN at reorderbuffer.c:2246:17
ReorderBufferReplay at reorderbuffer.c:2725:2
ReorderBufferPrepare at reorderbuffer.c:2822:2
DecodePrepare at decode.c:826:2
xact_decode at decode.c:347:5
LogicalDecodingProcessRecord at decode.c:123:1
XLogSendLogical at walsender.c:3413:33
WalSndLoop at walsender.c:2814:4
StartLogicalReplication at walsender.c:1523:2
exec_replication_command at walsender.c:2148:5
PostgresMain at postgres.c:4763:11
BackendInitialize at backend_startup.c:123:1
postmaster_child_launch at launch_backend.c:281:9
BackendStartup at postmaster.c:3593:8
ServerLoop at postmaster.c:1677:10
PostmasterMain at postmaster.c:1372:11
startup_hacks at main.c:217:1
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)
[0x7f30a90d7e40]
postgres: node_publisher: walsender law postgres [local]
START_REPLICATION(_start+0x25) [0x5647b6023565]
2024-09-28 06:44:50.713 UTC [3741741:19][walsender][25/0:0] LOG: released
logical replication slot "test_sub"
2024-09-28 06:44:50.713 UTC [3741741:20][walsender][25/0:0] LOG: could not
send data to client: Broken pipe
2024-09-28 06:44:50.713 UTC [3741741:21][walsender][25/0:0] FATAL:
connection to client lost

The issue can be easily reproduced with 022_twophase_cascade.pl modified
like this:
@@ -67,0 +68 @@ $node_A->safe_psql('postgres',
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
@@ -69 +70 @@ $node_B->safe_psql('postgres',
-    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999)"
+    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999, t text DEFAULT '$default')"
@@ -72 +73 @@ $node_C->safe_psql('postgres',
-    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999)"
+    "CREATE TABLE test_tab (a int primary key, b bytea, c timestamptz
DEFAULT now(), d bigint DEFAULT 999, t text DEFAULT '$default')"

Reproduced on REL_15_STABLE (starting from a8fd13cab) .. master.

Thank you for reporting the issue.
I was able to reproduce the issue by modifying 022_twophase_cascade.pl
accordingly.

The scan for toast index is actually done under systable_getnext_ordered,
where HandleConcurrentAbort() is called. So it seems to me that this
scan is actually safe for concurrent abort in logical decoding.
Logic around HandleConcurrentAbort is intorduced
https://github.com/postgres/postgres/commit/7259736a6e5b7c7588fff9578370736a6648acbb.

Though I may not understand the logic around HandleConcurrentAbort
fully and I am not sure not-setting bsysscan at
systable_beginscan_ordered is intentional,
it seems to me setting and unsetting a bsysscan flag in
systable_beginscan_ordered and systable_endscan_ordered would resolve
the issue.
Patch is attached.

Regards,
Takeshi Ideriha

Attachments:

0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered.patchapplication/octet-stream; name=0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered.patchDownload
From 181ff3783ea73ef909aec23b7fdd2378d08f166a Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..94a00ec00b 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,15 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+
+   	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+	
 	return sysscan;
 }
 
@@ -757,6 +766,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+	
 	pfree(sysscan);
 }
 
-- 
2.40.1

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Takeshi Ideriha (#2)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Mon, Sep 30, 2024 at 6:46 AM Takeshi Ideriha
<iderihatakeshi@gmail.com> wrote:

Patch is attached.

Thanks for the patch. I'll look into it.

--
With Regards,
Amit Kapila.

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Takeshi Ideriha (#2)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Mon, Sep 30, 2024 at 6:46 AM Takeshi Ideriha
<iderihatakeshi@gmail.com> wrote:

Thank you for reporting the issue.
I was able to reproduce the issue by modifying 022_twophase_cascade.pl
accordingly.

The scan for toast index is actually done under systable_getnext_ordered,
where HandleConcurrentAbort() is called. So it seems to me that this
scan is actually safe for concurrent abort in logical decoding.
Logic around HandleConcurrentAbort is intorduced
https://github.com/postgres/postgres/commit/7259736a6e5b7c7588fff9578370736a6648acbb.

Though I may not understand the logic around HandleConcurrentAbort
fully and I am not sure not-setting bsysscan at
systable_beginscan_ordered is intentional,
it seems to me setting and unsetting a bsysscan flag in
systable_beginscan_ordered and systable_endscan_ordered would resolve
the issue.

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,
shouldn't this occur even without prepare transaction? If so, we need
to backpatch this till 14.

Also, it is better to have a test for this, and let's ensure that the
new test doesn't increase the regression time too much if possible.

One minor point:
+
+    /*
+ * If CheckXidAlive is set then set a flag to indicate that system table

The indentation in the first comment line seems off.

--
With Regards,
Amit Kapila.

#5Takeshi Ideriha
iderihatakeshi@gmail.com
In reply to: Amit Kapila (#4)
1 attachment(s)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

Hi, Amit
(adding hackers for discussion)

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we need
to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber =
PostgreSQL::Test::Cluster->new('subscriber');
 $node_subscriber->init;
 $node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
 # Create some preexisting content on publisher
 $node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
 $node_publisher->safe_psql('postgres',
        "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

 # Setup structure on subscriber
 $node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
999)"
 );
```

Logs from head source:
```
2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel
on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
generate_series(3, 5000) s(i);
2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost
...
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical
decoding for slot "tap_sub"
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming
transactions committing after 0/1583A68, reading WAL from 0/1583A30.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding
found consistent point at 0/1583A30
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no
running transactions.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost
```

Logs from PG14 source:

```
2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
5000) s(i);
2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost
...
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding
found consistent point at 0/172C758
2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no
running transactions.
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost
```

Also, it is better to have a test for this, and let's ensure that the
new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The indentation in the first comment line seems off.

Opps, thank you. Fixed.

regards,
Takeshi Ideriha

Attachments:

0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-v2.patchapplication/x-patch; name=0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-v2.patchDownload
From 2a4116ebf5f6b838dd3df64c78322c4e70375b6f Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..2ecc3e8b31 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+	
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+	
 	pfree(sysscan);
 }
 
-- 
2.40.1

#6Takeshi Ideriha
iderihatakeshi@gmail.com
In reply to: Takeshi Ideriha (#5)
1 attachment(s)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

Hi, Amit
(Sending this again due to previous email got moderated because I
tried to multi post to hackers and bugs.)

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we need
to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber =
PostgreSQL::Test::Cluster->new
('subscriber');
 $node_subscriber->init;
 $node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
 # Create some preexisting content on publisher
 $node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
 $node_publisher->safe_psql('postgres',
        "INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

 # Setup structure on subscriber
 $node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
999)"
 );
```

Logs from head source:
```
2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel
on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
generate_series(3, 5000) s(i);
2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost
...
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical
decoding for slot "tap_sub"
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming
transactions committing after 0/1583A68, reading WAL from 0/1583A30.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding
found consistent point at 0/1583A30
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no
running transactions.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost
```

Logs from PG14 source:

```
2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
5000) s(i);
2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost
...
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding
found consistent point at 0/172C758
2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no
running transactions.
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost
```

Also, it is better to have a test for this, and let's ensure that the
new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The indentation in the first comment line seems off.

Opps, thank you. Fixed.

regards,
Takeshi Ideriha

Attachments:

0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-v2.patchapplication/octet-stream; name=0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-v2.patchDownload
From 2a4116ebf5f6b838dd3df64c78322c4e70375b6f Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..2ecc3e8b31 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+	
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+	
 	pfree(sysscan);
 }
 
-- 
2.40.1

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Takeshi Ideriha (#5)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha <iderihatakeshi@gmail.com> wrote:

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we need
to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber =
PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init;
$node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
# Create some preexisting content on publisher
$node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
$node_publisher->safe_psql('postgres',
"INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

# Setup structure on subscriber
$node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
999)"
);
```

Logs from head source:
```
2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel
on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
generate_series(3, 5000) s(i);
2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost
...
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical
decoding for slot "tap_sub"
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming
transactions committing after 0/1583A68, reading WAL from 0/1583A30.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding
found consistent point at 0/1583A30
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no
running transactions.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost
```

Logs from PG14 source:

```
2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
5000) s(i);
2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost
...
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding
found consistent point at 0/172C758
2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no
running transactions.
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost
```

Also, it is better to have a test for this, and let's ensure that the
new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The above test shown in the email will work to test this issue.
However, we should use 'debug_logical_replication_streaming' as that
will help to reproduce the issue with just one write operation. Is it
possible to add the test in 018_stream_subxact_abort where we already
use it?

--
With Regards,
Amit Kapila.

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Takeshi Ideriha (#5)
Re: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha <iderihatakeshi@gmail.com> wrote:

Sending it again to avoid getting it stuck because the original was
replied to both -hackers and -bugs.

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we need
to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber =
PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init;
$node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
# Create some preexisting content on publisher
$node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
$node_publisher->safe_psql('postgres',
"INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

# Setup structure on subscriber
$node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
999)"
);
```

Logs from head source:
```
2024-10-01 12:34:56.694 UTC [575202] LOG: starting PostgreSQL 18devel
on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 12:34:57.506 UTC [575258] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, sha256(i::text::bytea) FROM
generate_series(3, 5000) s(i);
2024-10-01 12:34:57.524 UTC [575242] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.524 UTC [575242] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.525 UTC [575242] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.525 UTC [575242] tap_sub FATAL: connection to client lost
...
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: starting logical
decoding for slot "tap_sub"
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: Streaming
transactions committing after 0/1583A68, reading WAL from 0/1583A30.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.829 UTC [575260] tap_sub LOG: logical decoding
found consistent point at 0/1583A30
2024-10-01 12:34:57.829 UTC [575260] tap_sub DETAIL: There are no
running transactions.
2024-10-01 12:34:57.829 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.831 UTC [575260] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 12:34:57.831 UTC [575260] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '4',
streaming 'on', origin 'any', publication_names '"tap_pub"')
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: released logical
replication slot "tap_sub"
2024-10-01 12:34:57.832 UTC [575260] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 12:34:57.832 UTC [575260] tap_sub FATAL: connection to client lost
```

Logs from PG14 source:

```
2024-10-01 13:20:29.409 UTC [593696] LOG: starting PostgreSQL 14.9 on
x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1 20230605 (Red Hat
11.4.1-2), 64-bit
...
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement: BEGIN;
2024-10-01 13:20:31.285 UTC [593750] 015_stream.pl LOG: statement:
INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series(3,
5000) s(i);
2024-10-01 13:20:31.301 UTC [593740] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.301 UTC [593740] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.302 UTC [593740] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.302 UTC [593740] tap_sub FATAL: connection to client lost
...
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.830 UTC [593756] tap_sub LOG: logical decoding
found consistent point at 0/172C758
2024-10-01 13:20:31.830 UTC [593756] tap_sub DETAIL: There are no
running transactions.
2024-10-01 13:20:31.830 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.834 UTC [593756] tap_sub ERROR: unexpected
table_index_fetch_tuple call during logical decoding
2024-10-01 13:20:31.834 UTC [593756] tap_sub STATEMENT:
START_REPLICATION SLOT "tap_sub" LOGICAL 0/0 (proto_version '2',
streaming 'on', publication_names '"tap_pub"')
2024-10-01 13:20:31.835 UTC [593756] tap_sub LOG: could not send data
to client: Broken pipe
2024-10-01 13:20:31.835 UTC [593756] tap_sub FATAL: connection to client lost
```

Also, it is better to have a test for this, and let's ensure that the
new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The above test shown in the email will work to test this issue.
However, we should use 'debug_logical_replication_streaming' as that
will help to reproduce the issue with just one write operation. Is it
possible to add the test in 018_stream_subxact_abort where we already
use it?

--
With Regards,
Amit Kapila.

#9Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Amit Kapila (#8)
2 attachment(s)
RE: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Saturday, October 5, 2024 6:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Sending it again to avoid getting it stuck because the original was replied to both -hackers and -bugs.

On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha
<iderihatakeshi@gmail.com>
wrote:

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we
need to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber = 
PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init;
$node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
# Create some preexisting content on publisher 
$node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
$node_publisher->safe_psql('postgres',
"INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

# Setup structure on subscriber
$node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT 
999)"

Also, it is better to have a test for this, and let's ensure that
the new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The above test shown in the email will work to test this issue.
However, we should use 'debug_logical_replication_streaming' as that
will help to reproduce the issue with just one write operation. Is it
possible to add the test in 018_stream_subxact_abort where we already use it?

I think it might be better to put the test in test_decoding/sql/stream.sql as
the bug is only related to the decoding part. Writing it in sql test also seems cheaper.

Just share a top-up patch 0002 which includes the tests I suggested for
reference. The test would fail without the fix.

Node that I didn't change any logic in 0001 as the fix looks good to
me. I just ran the pgindent for the 0001.

Best Regards,
Hou zj

Attachments:

v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered.patchapplication/octet-stream; name=v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered.patchDownload
From 83804c82c289d2fc3fea00058fff0ab934b590ac Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH v3] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..69c3608432 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+
 	pfree(sysscan);
 }
 
-- 
2.31.1

v3-0002-add-a-testcase-in-stream.sql.patchapplication/octet-stream; name=v3-0002-add-a-testcase-in-stream.sql.patchDownload
From f36657c9cfe1620a3f698a5932a97c019d7322e9 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Mon, 7 Oct 2024 11:48:57 +0800
Subject: [PATCH v3 2/2] add a testcase in stream.sql

---
 contrib/test_decoding/expected/stream.out | 22 ++++++++++++++++++++++
 contrib/test_decoding/sql/stream.sql      | 22 ++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/contrib/test_decoding/expected/stream.out b/contrib/test_decoding/expected/stream.out
index a76f77601e..6a8a00a65b 100644
--- a/contrib/test_decoding/expected/stream.out
+++ b/contrib/test_decoding/expected/stream.out
@@ -128,6 +128,28 @@ SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
      5
 (1 row)
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SET debug_logical_replication_streaming = immediate;
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+  ?column?   
+-------------
+ force flush
+(1 row)
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+ count 
+-------
+     3
+(1 row)
+
+COMMIT;
+RESET debug_logical_replication_streaming;
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
  pg_drop_replication_slot 
diff --git a/contrib/test_decoding/sql/stream.sql b/contrib/test_decoding/sql/stream.sql
index 7f43f0c2ab..d7a696586e 100644
--- a/contrib/test_decoding/sql/stream.sql
+++ b/contrib/test_decoding/sql/stream.sql
@@ -59,5 +59,27 @@ ROLLBACK TO s1;
 COMMIT;
 SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE test_tab (a text DEFAULT ''' || string_agg('toast value', '') || ''');' FROM generate_series(1, 4000)
+\gexec
+\set ECHO all
+
+SET debug_logical_replication_streaming = immediate;
+
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+COMMIT;
+
+RESET debug_logical_replication_streaming;
+
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
-- 
2.30.0.windows.2

#10Zhijie Hou (Fujitsu)
houzj.fnst@fujitsu.com
In reply to: Zhijie Hou (Fujitsu) (#9)
8 attachment(s)
RE: BUG #18641: Logical decoding of two-phase commit fails with TOASTed default values

On Monday, October 7, 2024 1:57 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:

On Saturday, October 5, 2024 6:12 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

Sending it again to avoid getting it stuck because the original was replied to
both -hackers and -bugs.

On Tue, Oct 1, 2024 at 7:07 PM Takeshi Ideriha
<iderihatakeshi@gmail.com>
wrote:

We forgot to set/unset the flag in functions
systable_beginscan_ordered and systable_endscan_ordered. BTW,

Thank you for the clarification.

shouldn't this occur even without prepare transaction? If so, we
need to backpatch this till 14.

Yes, it occurred also at PG14.
I did some tests using 015_stream.pl with some modification like
below, which tests the subscription about stream is on but two-phase
is off.
The same issue occurred at both current head source code and PG14.

```
--- a/src/test/subscription/t/015_stream.pl
+++ b/src/test/subscription/t/015_stream.pl
@@ -134,9 +134,11 @@ my $node_subscriber =
PostgreSQL::Test::Cluster->new('subscriber');
$node_subscriber->init;
$node_subscriber->start;
+my $default = join('', map {chr(65 + rand 26)} (1 .. 10000));
+
# Create some preexisting content on publisher
$node_publisher->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea)");
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default')");
$node_publisher->safe_psql('postgres',
"INSERT INTO test_tab VALUES (1, 'foo'), (2, 'bar')");

@@ -144,7 +146,7 @@ $node_publisher->safe_psql('postgres', "CREATE
TABLE test_tab_2 (a int)");

# Setup structure on subscriber
$node_subscriber->safe_psql('postgres',
-       "CREATE TABLE test_tab (a int primary key, b bytea, c
timestamptz DEFAULT now(), d bigint DEFAULT 999)"
+       "CREATE TABLE test_tab (a int primary key, b bytea, t text
DEFAULT '$default', c timestamptz DEFAULT now(), d bigint DEFAULT
999)"

Also, it is better to have a test for this, and let's ensure that
the new test doesn't increase the regression time too much if possible.

Sure. I'm going to add some test codes in a few days.

The above test shown in the email will work to test this issue.
However, we should use 'debug_logical_replication_streaming' as that
will help to reproduce the issue with just one write operation. Is it
possible to add the test in 018_stream_subxact_abort where we already use

it?

I think it might be better to put the test in test_decoding/sql/stream.sql as the
bug is only related to the decoding part. Writing it in sql test also seems
cheaper.

Just share a top-up patch 0002 which includes the tests I suggested for
reference. The test would fail without the fix.

Node that I didn't change any logic in 0001 as the fix looks good to me. I just ran
the pgindent for the 0001.

While testing the fix on back-branches, I realized that we don't have the
debug_logical_replication_streaming GUC on PG14 and 15. So I tried to generate
enough data to exceed the logical_decoding_work_mem in the test instead. Please
see the attachment for reference. I also rebased 0001 on back-branches, so just
share them here as well.

Best Regards,
Hou zj

Attachments:

v3-0002-add-a-testcase-in-stream.sql-PG14-PG15.patchapplication/octet-stream; name=v3-0002-add-a-testcase-in-stream.sql-PG14-PG15.patchDownload
From 8a4d3d81a182ba774606047f72e91b76ced613c2 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Mon, 7 Oct 2024 15:16:25 +0800
Subject: [PATCH v3] add a testcase in stream.sql PG14

---
 contrib/test_decoding/expected/stream.out | 20 ++++++++++++++++++++
 contrib/test_decoding/sql/stream.sql      | 18 ++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/contrib/test_decoding/expected/stream.out b/contrib/test_decoding/expected/stream.out
index 0f21dcb8e0..502cd2af17 100644
--- a/contrib/test_decoding/expected/stream.out
+++ b/contrib/test_decoding/expected/stream.out
@@ -106,6 +106,26 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,NULL, 'incl
  committing streamed transaction
 (17 rows)
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+BEGIN;
+INSERT INTO test_tab SELECT repeat('a', 6000) || g.i FROM generate_series(1, 350) g(i);
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+  ?column?   
+-------------
+ force flush
+(1 row)
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+ count 
+-------
+   315
+(1 row)
+
+COMMIT;
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
  pg_drop_replication_slot 
diff --git a/contrib/test_decoding/sql/stream.sql b/contrib/test_decoding/sql/stream.sql
index 4feec62972..d6915958d4 100644
--- a/contrib/test_decoding/sql/stream.sql
+++ b/contrib/test_decoding/sql/stream.sql
@@ -44,5 +44,23 @@ toasted-123456789012345678901234567890123456789012345678901234567890123456789012
 
 SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE test_tab (a text DEFAULT ''' || string_agg('toast value', '') || ''');' FROM generate_series(1, 4000)
+\gexec
+\set ECHO all
+
+BEGIN;
+INSERT INTO test_tab SELECT repeat('a', 6000) || g.i FROM generate_series(1, 350) g(i);
+
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+COMMIT;
+
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
-- 
2.30.0.windows.2

v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG14-PG15.patchapplication/octet-stream; name=v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG14-PG15.patchDownload
From 83804c82c289d2fc3fea00058fff0ab934b590ac Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH v3] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..69c3608432 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, key, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+
 	pfree(sysscan);
 }
 
-- 
2.31.1

v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG17.patchapplication/octet-stream; name=v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG17.patchDownload
From 83804c82c289d2fc3fea00058fff0ab934b590ac Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH v3] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..69c3608432 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, key, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+
 	pfree(sysscan);
 }
 
-- 
2.31.1

v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-HEAD.patchapplication/octet-stream; name=v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-HEAD.patchDownload
From 83804c82c289d2fc3fea00058fff0ab934b590ac Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH v3] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 52fde5cc4d..69c3608432 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -713,6 +713,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, idxkey, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+
 	return sysscan;
 }
 
@@ -757,6 +765,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+
 	pfree(sysscan);
 }
 
-- 
2.31.1

v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG16.patchapplication/octet-stream; name=v3-0001-Set-and-unset-bsysscan-flag-in-systable_-_ordered-PG16.patchDownload
From 9b90537c94f3f2128fc58acd905e1ad5a1d5748f Mon Sep 17 00:00:00 2001
From: Ideriha Takeshi <iderihatakeshi@gmail.com>
Date: Mon, 30 Sep 2024 00:24:13 +0000
Subject: [PATCH v4 1/2] Set and unset bsysscan flag in systable_*_ordered.

We call HandleConcurrentAbort in systable_getnext_ordered so that
we can handle concurrent abort case in logical decoding same as
systable_getnext. Set and unset bsysscan flag is forgotten.
---
 src/backend/access/index/genam.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 16c7aa9fbf..e3b626b303 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -703,6 +703,14 @@ systable_beginscan_ordered(Relation heapRelation,
 	index_rescan(sysscan->iscan, key, nkeys, NULL, 0);
 	sysscan->scan = NULL;
 
+	/*
+	 * If CheckXidAlive is set then set a flag to indicate that system table
+	 * scan is in-progress.  See detailed comments in xact.c where these
+	 * variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = true;
+
 	return sysscan;
 }
 
@@ -747,6 +755,14 @@ systable_endscan_ordered(SysScanDesc sysscan)
 	index_endscan(sysscan->iscan);
 	if (sysscan->snapshot)
 		UnregisterSnapshot(sysscan->snapshot);
+
+	/*
+	 * Reset the bsysscan flag at the end of the systable scan.  See detailed
+	 * comments in xact.c where these variables are declared.
+	 */
+	if (TransactionIdIsValid(CheckXidAlive))
+		bsysscan = false;
+
 	pfree(sysscan);
 }
 
-- 
2.31.1

v3-0002-add-a-testcase-in-stream.sql-PG16.patchapplication/octet-stream; name=v3-0002-add-a-testcase-in-stream.sql-PG16.patchDownload
From 066ef2eb5de3f7403cb4d779190a2275a06dc277 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Mon, 7 Oct 2024 14:29:55 +0800
Subject: [PATCH v4 2/2] add a testcase in stream.sql PG16

---
 contrib/test_decoding/expected/stream.out | 22 ++++++++++++++++++++++
 contrib/test_decoding/sql/stream.sql      | 22 ++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/contrib/test_decoding/expected/stream.out b/contrib/test_decoding/expected/stream.out
index 0f21dcb8e0..d4bfbf975d 100644
--- a/contrib/test_decoding/expected/stream.out
+++ b/contrib/test_decoding/expected/stream.out
@@ -106,6 +106,28 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,NULL, 'incl
  committing streamed transaction
 (17 rows)
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SET debug_logical_replication_streaming = immediate;
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+  ?column?   
+-------------
+ force flush
+(1 row)
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+ count 
+-------
+     3
+(1 row)
+
+COMMIT;
+RESET debug_logical_replication_streaming;
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
  pg_drop_replication_slot 
diff --git a/contrib/test_decoding/sql/stream.sql b/contrib/test_decoding/sql/stream.sql
index 4feec62972..0549d3204a 100644
--- a/contrib/test_decoding/sql/stream.sql
+++ b/contrib/test_decoding/sql/stream.sql
@@ -44,5 +44,27 @@ toasted-123456789012345678901234567890123456789012345678901234567890123456789012
 
 SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE test_tab (a text DEFAULT ''' || string_agg('toast value', '') || ''');' FROM generate_series(1, 4000)
+\gexec
+\set ECHO all
+
+SET debug_logical_replication_streaming = immediate;
+
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+COMMIT;
+
+RESET debug_logical_replication_streaming;
+
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
-- 
2.31.1

v3-0002-add-a-testcase-in-stream.sql-HEAD.patchapplication/octet-stream; name=v3-0002-add-a-testcase-in-stream.sql-HEAD.patchDownload
From f36657c9cfe1620a3f698a5932a97c019d7322e9 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Mon, 7 Oct 2024 11:48:57 +0800
Subject: [PATCH v3 2/2] add a testcase in stream.sql

---
 contrib/test_decoding/expected/stream.out | 22 ++++++++++++++++++++++
 contrib/test_decoding/sql/stream.sql      | 22 ++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/contrib/test_decoding/expected/stream.out b/contrib/test_decoding/expected/stream.out
index a76f77601e..6a8a00a65b 100644
--- a/contrib/test_decoding/expected/stream.out
+++ b/contrib/test_decoding/expected/stream.out
@@ -128,6 +128,28 @@ SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
      5
 (1 row)
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SET debug_logical_replication_streaming = immediate;
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+  ?column?   
+-------------
+ force flush
+(1 row)
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+ count 
+-------
+     3
+(1 row)
+
+COMMIT;
+RESET debug_logical_replication_streaming;
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
  pg_drop_replication_slot 
diff --git a/contrib/test_decoding/sql/stream.sql b/contrib/test_decoding/sql/stream.sql
index 7f43f0c2ab..d7a696586e 100644
--- a/contrib/test_decoding/sql/stream.sql
+++ b/contrib/test_decoding/sql/stream.sql
@@ -59,5 +59,27 @@ ROLLBACK TO s1;
 COMMIT;
 SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE test_tab (a text DEFAULT ''' || string_agg('toast value', '') || ''');' FROM generate_series(1, 4000)
+\gexec
+\set ECHO all
+
+SET debug_logical_replication_streaming = immediate;
+
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+COMMIT;
+
+RESET debug_logical_replication_streaming;
+
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
-- 
2.30.0.windows.2

v3-0002-add-a-testcase-in-stream.sql-PG17.patchapplication/octet-stream; name=v3-0002-add-a-testcase-in-stream.sql-PG17.patchDownload
From f36657c9cfe1620a3f698a5932a97c019d7322e9 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Mon, 7 Oct 2024 11:48:57 +0800
Subject: [PATCH v3 2/2] add a testcase in stream.sql

---
 contrib/test_decoding/expected/stream.out | 22 ++++++++++++++++++++++
 contrib/test_decoding/sql/stream.sql      | 22 ++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/contrib/test_decoding/expected/stream.out b/contrib/test_decoding/expected/stream.out
index a76f77601e..6a8a00a65b 100644
--- a/contrib/test_decoding/expected/stream.out
+++ b/contrib/test_decoding/expected/stream.out
@@ -128,6 +128,28 @@ SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL,
      5
 (1 row)
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SET debug_logical_replication_streaming = immediate;
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+  ?column?   
+-------------
+ force flush
+(1 row)
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+ count 
+-------
+     3
+(1 row)
+
+COMMIT;
+RESET debug_logical_replication_streaming;
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
  pg_drop_replication_slot 
diff --git a/contrib/test_decoding/sql/stream.sql b/contrib/test_decoding/sql/stream.sql
index 7f43f0c2ab..d7a696586e 100644
--- a/contrib/test_decoding/sql/stream.sql
+++ b/contrib/test_decoding/sql/stream.sql
@@ -59,5 +59,27 @@ ROLLBACK TO s1;
 COMMIT;
 SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
 
+-- Test that accessing a TOAST table in streaming mode is allowed.
+
+-- Create a table with a column that uses a TOASTed default value.
+-- (temporarily hide query, to avoid the long CREATE TABLE stmt)
+\set ECHO none
+SELECT 'CREATE TABLE test_tab (a text DEFAULT ''' || string_agg('toast value', '') || ''');' FROM generate_series(1, 4000)
+\gexec
+\set ECHO all
+
+SET debug_logical_replication_streaming = immediate;
+
+BEGIN;
+INSERT INTO test_tab VALUES(1);
+
+-- Force WAL flush, so that the above changes will be streamed.
+SELECT 'force flush' FROM pg_switch_wal();
+
+SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1', 'stream-changes', '1');
+COMMIT;
+
+RESET debug_logical_replication_streaming;
+
 DROP TABLE stream_test;
 SELECT pg_drop_replication_slot('regression_slot');
-- 
2.30.0.windows.2