Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Started by ocean_li_996 about 2 months ago, 6 messages
#1 ocean_li_996
ocean_li_996@163.com
4 attachment(s)

Hi all,

I would like to share a logical replication bug and some possible fixes. The bug appears to have existed since
logical replication was first introduced, so it has been around for quite some time. In fact, the previously
reported issues [1], [2], and [3] were all caused by it.

# Problem description

While in the BUILDING_SNAPSHOT state, the snapshot builder does not track the status of any
transaction. As a result, a transaction's state is lost when both of the following hold:
-- The transaction commits before the builder reaches the FULL_SNAPSHOT state, and
-- The transaction's xid is greater than or equal to builder->xmin when the builder reaches the
FULL_SNAPSHOT state.

Once in the FULL_SNAPSHOT state, the builder constructs a base snapshot from this incomplete transaction state
information. The resulting base snapshot is incorrect, which can cause unpredictable behavior during
subsequent decoding. The test case in the v6-0002 attachment (provided by ChangAo Chen) reproduces the issue.
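
The two conditions above can be written down as a small predicate. This is only a sketch under stated assumptions, not PostgreSQL code: `ToyState`, `ToyXid`, `toy_xid_follows_or_equals` and `commit_is_lost` are hypothetical names introduced for illustration, and the xid comparison is a simplified circular comparison that ignores special xids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the builder states named in this thread. */
typedef enum
{
	TOY_START = -1,
	TOY_BUILDING_SNAPSHOT = 0,
	TOY_FULL_SNAPSHOT = 1,
	TOY_CONSISTENT = 2
} ToyState;

typedef uint32_t ToyXid;

/* Simplified circular comparison of two normal xids: true if a >= b. */
static bool
toy_xid_follows_or_equals(ToyXid a, ToyXid b)
{
	return (int32_t) (a - b) >= 0;
}

/*
 * True if a commit falls into the gap described above: it was seen before
 * the builder reached FULL_SNAPSHOT (so it was not recorded), yet its xid
 * is >= the builder's xmin at FULL_SNAPSHOT (so the snapshot needed it).
 */
bool
commit_is_lost(ToyState state_at_commit, ToyXid xid, ToyXid xmin_at_full)
{
	assert(state_at_commit >= TOY_START && state_at_commit <= TOY_CONSISTENT);
	return state_at_commit < TOY_FULL_SNAPSHOT &&
		toy_xid_follows_or_equals(xid, xmin_at_full);
}
```

For example, with an assumed xmin of 100 at FULL_SNAPSHOT, a transaction with xid 105 that commits during BUILDING_SNAPSHOT satisfies the predicate, while the same transaction committing after FULL_SNAPSHOT does not.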

# Code-level analysis

SnapBuildCommitTxn does handle transactions that commit during the BUILDING_SNAPSHOT state. However, it
is only called via xact_decode -> DecodeCommit, and xact_decode does not process any xact record while the snapshot
builder has not yet reached the FULL_SNAPSHOT state, so those commits are ignored. Similarly, the other
functions that mark a transaction as having catalog changes (e.g., heap2_decode) do not handle records before
the FULL_SNAPSHOT state is reached.
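
The gating described above can be illustrated with a small standalone model. It is a sketch, not PostgreSQL code: the enum mirrors the builder states named in this thread, while `ToyState`, `record_is_processed` and the `min_state` parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the builder states named in this thread. */
typedef enum
{
	TOY_START = -1,
	TOY_BUILDING_SNAPSHOT = 0,
	TOY_FULL_SNAPSHOT = 1,
	TOY_CONSISTENT = 2
} ToyState;

/*
 * Model of the early-return check at the top of a decode routine: a record
 * is looked at only if the builder has reached at least min_state.
 */
bool
record_is_processed(ToyState current, ToyState min_state)
{
	assert(current >= TOY_START && current <= TOY_CONSISTENT);
	return current >= min_state;
}
```

With `min_state = TOY_FULL_SNAPSHOT` (modelling the existing check), a commit record seen in `TOY_BUILDING_SNAPSHOT` is dropped. With `min_state = TOY_BUILDING_SNAPSHOT` (modelling the check as changed by the attached patches), it is processed.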

# Possible fixes

1. Replace snapshot at the time we reach CONSISTENT state.

My initial thought, shared by Ajin Cherian in [4], was that although the snapshot at the FULL_SNAPSHOT state might be
wrong, the snapshot at the CONSISTENT state is guaranteed to be correct. Since decoding always starts after
reaching the CONSISTENT state, we could update both the reorder buffer and the builder snapshot with the one
captured at the CONSISTENT state. However, as I understand it, changes generated before CONSISTENT would still
carry a wrong snapshot (see SnapBuildDistributeSnapshotAndInval).

2. Track transactions during the BUILDING_SNAPSHOT state in the snapshot builder.

Since the builder does not track transactions in the BUILDING_SNAPSHOT state, we make it track them.

1) ChangAo Chen provided a fix in the v6-0001 attachment, which has already been reviewed by several people (including me).
Bertrand Drouvot in [5] considered the logic a bit messy. I would prefer to make the behavior of
snapshot building the same in both the BUILDING_SNAPSHOT and FULL_SNAPSHOT states, except in cases where
a base snapshot is needed.

2) Based on v6-0001, I have provided a minimal fix in v6-0003 (not yet reviewed). AFAICS, it resolves
the problem, though it records additional useless information in the reorder buffer during BUILDING_SNAPSHOT
state (which is discarded later). This increases memory usage and slightly impacts performance. But since
snapshot building is infrequent, I consider this acceptable.

3) In v6-0004 I have also prepared a fix that is cleaner and more efficient than v6-0003, albeit more complex
(similar to v6-0001). It is provided as an alternative reference.

I think we should fix this issue to ensure snapshot building is correct.
Looking forward to your reviews and any feedback on the above proposed solutions.

Best regards,
Haiyang Li

[1]: /messages/by-id/tencent_6AAF072A7623A11A85C0B5FD290232467808@qq.com
[2]: /messages/by-id/18509-983f064d174ea880@postgresql.org
[3]: /messages/by-id/2b9e5ac8.136f.19a8f7297ee.Coremail.ocean_li_996@163.com
[4]: /messages/by-id/CAFPTHDYSQipcO_+GNt-ZQsk6cidt9Lc4PkcdvO7jnrugiUw0eg@mail.gmail.com
[5]: /messages/by-id/ZrnlgJEH473Q1kTp@ip-10-97-1-34.eu-west-3.compute.internal

Attachments:

v6-0002-Add-test-case-snapshot_build-for-test_decoding.patch (application/octet-stream)
From d4747efa7c8103ec259a725051fff3bc4849dc17 Mon Sep 17 00:00:00 2001
From: ChangAo Chen <cca5507@qq.com>
Date: Fri, 21 Nov 2025 15:48:27 +0800
Subject: [PATCH v6 2/2] Add test case snapshot_build for test_decoding.

---
 contrib/test_decoding/Makefile                |  3 +-
 .../test_decoding/expected/snapshot_build.out | 33 +++++++++++++
 contrib/test_decoding/meson.build             |  1 +
 .../test_decoding/specs/snapshot_build.spec   | 46 +++++++++++++++++++
 4 files changed, 82 insertions(+), 1 deletion(-)
 create mode 100644 contrib/test_decoding/expected/snapshot_build.out
 create mode 100644 contrib/test_decoding/specs/snapshot_build.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index acbcaed2feb..60210726566 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -9,7 +9,8 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
 	twophase_snapshot slot_creation_error catalog_change_snapshot \
-	skip_snapshot_restore invalidation_distribution parallel_session_origin
+	skip_snapshot_restore invalidation_distribution parallel_session_origin \
+	snapshot_build
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/snapshot_build.out b/contrib/test_decoding/expected/snapshot_build.out
new file mode 100644
index 00000000000..0fcf20cce86
--- /dev/null
+++ b/contrib/test_decoding/expected/snapshot_build.out
@@ -0,0 +1,33 @@
+Parsed test spec with 4 sessions
+
+starting permutation: s1_begin s1_insert s2_init s3_begin s3_insert s4_create s1_commit s4_begin s4_insert s3_commit s4_commit s2_get_changes
+step s1_begin: BEGIN;
+step s1_insert: INSERT INTO tbl1 VALUES (1);
+step s2_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); <waiting ...>
+step s3_begin: BEGIN;
+step s3_insert: INSERT INTO tbl1 VALUES (1);
+step s4_create: CREATE TABLE tbl2 (val1 integer);
+step s1_commit: COMMIT;
+step s4_begin: BEGIN;
+step s4_insert: INSERT INTO tbl2 VALUES (1);
+step s3_commit: COMMIT;
+step s2_init: <... completed>
+?column?
+--------
+init    
+(1 row)
+
+step s4_commit: COMMIT;
+step s2_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data                                      
+------------------------------------------
+BEGIN                                     
+table public.tbl2: INSERT: val1[integer]:1
+COMMIT                                    
+(3 rows)
+
+?column?
+--------
+stop    
+(1 row)
+
diff --git a/contrib/test_decoding/meson.build b/contrib/test_decoding/meson.build
index 99310555e6c..252a39b7727 100644
--- a/contrib/test_decoding/meson.build
+++ b/contrib/test_decoding/meson.build
@@ -65,6 +65,7 @@ tests += {
       'skip_snapshot_restore',
       'invalidation_distribution',
       'parallel_session_origin',
+      'snapshot_build',
     ],
     'regress_args': [
       '--temp-config', files('logical.conf'),
diff --git a/contrib/test_decoding/specs/snapshot_build.spec b/contrib/test_decoding/specs/snapshot_build.spec
new file mode 100644
index 00000000000..334531dd219
--- /dev/null
+++ b/contrib/test_decoding/specs/snapshot_build.spec
@@ -0,0 +1,46 @@
+# Test snapshot build correctly, it must track committed transactions during BUILDING_SNAPSHOT
+
+setup
+{
+    DROP TABLE IF EXISTS tbl1;
+    DROP TABLE IF EXISTS tbl2;
+    CREATE TABLE tbl1 (val1 integer);
+}
+
+teardown
+{
+    DROP TABLE tbl1;
+    DROP TABLE tbl2;
+    SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot');
+}
+
+session "s1"
+setup { SET synchronous_commit=on; }
+step "s1_begin" { BEGIN; }
+step "s1_insert" { INSERT INTO tbl1 VALUES (1); }
+step "s1_commit" { COMMIT; }
+
+session "s2"
+setup { SET synchronous_commit=on; }
+step "s2_init" { SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); }
+step "s2_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0'); }
+
+session "s3"
+setup { SET synchronous_commit=on; }
+step "s3_begin" { BEGIN; }
+step "s3_insert" { INSERT INTO tbl1 VALUES (1); }
+step "s3_commit" { COMMIT; }
+
+session "s4"
+setup { SET synchronous_commit=on; }
+step "s4_create" { CREATE TABLE tbl2 (val1 integer); }
+step "s4_begin" { BEGIN; }
+step "s4_insert" { INSERT INTO tbl2 VALUES (1); }
+step "s4_commit" { COMMIT; }
+
+# T1: s1_begin -> s1_insert -> BUILDING_SNAPSHOT -> s1_commit -> FULL_SNAPSHOT
+# T2: BUILDING_SNAPSHOT -> s3_begin -> s3_insert -> FULL_SNAPSHOT -> s3_commit -> CONSISTENT
+# T3: BUILDING_SNAPSHOT -> s4_create -> FULL_SNAPSHOT
+# T4: FULL_SNAPSHOT -> s4_begin -> s4_insert -> CONSISTENT -> s4_commit
+# The snapshot must track T3 or the replay of T4 will fail because its snapshot cannot see tbl2
+permutation "s1_begin" "s1_insert" "s2_init" "s3_begin" "s3_insert" "s4_create" "s1_commit" "s4_begin" "s4_insert" "s3_commit" "s4_commit" "s2_get_changes"
-- 
2.34.1

v6-0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch (application/octet-stream)
From 12dd3434ef13609b324bbbbe68a3f0e2a48934a2 Mon Sep 17 00:00:00 2001
From: ChangAo Chen <cca5507@qq.com>
Date: Fri, 21 Nov 2025 15:19:22 +0800
Subject: [PATCH v6 1/2] Track transactions committed in BUILDING_SNAPSHOT.

The historic snapshot previously didn't track transactions committed
in BUILDING_SNAPSHOT, this might result in a transaction taking an
incorrect snapshot and logical decoding being interrupted. So we need
to track these transactions.

We also need to handle the xlog which means a catalog change in BUILDING_SNAPSHOT
because the historic snapshot only tracks catalog modifying transactions.
---
 src/backend/replication/logical/decode.c | 33 ++++++++++++++++++++----
 1 file changed, 28 insertions(+), 5 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..de1bed30781 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -206,12 +206,16 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	uint8		info = XLogRecGetInfo(r) & XLOG_XACT_OPMASK;
 
 	/*
-	 * If the snapshot isn't yet fully built, we cannot decode anything, so
-	 * bail out.
+	 * If the snapshot hasn't started building yet, the transaction won't be
+	 * decoded or tracked by the snapshot, so bail out.
 	 */
-	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT)
 		return;
 
+	/*
+	 * Note that if the snapshot isn't yet fully built, the xlog is only used
+	 * to build the snapshot and won't be decoded.
+	 */
 	switch (info)
 	{
 		case XLOG_XACT_COMMIT:
@@ -282,18 +286,24 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 			{
 				TransactionId xid;
 				xl_xact_invals *invals;
+				bool has_snapshot;
 
 				xid = XLogRecGetXid(r);
 				invals = (xl_xact_invals *) XLogRecGetData(r);
+				has_snapshot =
+					SnapBuildCurrentState(builder) >= SNAPBUILD_FULL_SNAPSHOT;
 
 				/*
 				 * Execute the invalidations for xid-less transactions,
 				 * otherwise, accumulate them so that they can be processed at
 				 * the commit time.
+				 *
+				 * Note that we only need to do this when we are not fast-forwarding
+				 * and there is a snapshot.
 				 */
 				if (TransactionIdIsValid(xid))
 				{
-					if (!ctx->fast_forward)
+					if (!ctx->fast_forward && has_snapshot)
 						ReorderBufferAddInvalidations(reorder, xid,
 													  buf->origptr,
 													  invals->nmsgs,
@@ -301,7 +311,7 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 					ReorderBufferXidSetCatalogChanges(ctx->reorder, xid,
 													  buf->origptr);
 				}
-				else if (!ctx->fast_forward)
+				else if (!ctx->fast_forward && has_snapshot)
 					ReorderBufferImmediateInvalidation(ctx->reorder,
 													   invals->nmsgs,
 													   invals->msgs);
@@ -419,7 +429,19 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	 * SnapBuildProcessRunningXacts().
 	 */
 	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	{
+		/*
+		 * If we are building snapshot and the xlog means a catalog
+		 * change, we need to mark it in the reorder buffer.
+		 *
+		 * Now only XLOG_HEAP2_NEW_CID means a catalog change.
+		 */
+		if (SnapBuildCurrentState(builder) >= SNAPBUILD_BUILDING_SNAPSHOT &&
+			TransactionIdIsValid(xid) && info == XLOG_HEAP2_NEW_CID)
+			ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+
 		return;
+	}
 
 	switch (info)
 	{
@@ -1306,6 +1328,7 @@ DecodeTXNNeedSkip(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 				  Oid txn_dbid, RepOriginId origin_id)
 {
 	if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
+		SnapBuildCurrentState(ctx->snapshot_builder) < SNAPBUILD_CONSISTENT ||
 		(txn_dbid != InvalidOid && txn_dbid != ctx->slot->data.database) ||
 		FilterByOrigin(ctx, origin_id))
 		return true;
-- 
2.34.1

v6-0003-Track-transaction-committed-in-BUILDING_SNAPSHOT.patch (application/octet-stream)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..d124310e1ad 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -206,10 +206,10 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	uint8		info = XLogRecGetInfo(r) & XLOG_XACT_OPMASK;
 
 	/*
-	 * If the snapshot isn't yet fully built, we cannot decode anything, so
-	 * bail out.
+	 * If the snapshot hasn't started building yet, the transaction won't be
+	 * decoded or tracked by the snapshot, so bail out.
 	 */
-	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT)
 		return;
 
 	switch (info)
@@ -418,7 +418,7 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	 * determining the candidate catalog_xmin for the replication slot. See
 	 * SnapBuildProcessRunningXacts().
 	 */
-	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT)
 		return;
 
 	switch (info)

v6-0004-Track-transaction-committed-in-BUILDING_SNAPSHOT.patch (application/octet-stream)
diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index cc03f0706e9..62a6d3097a1 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -206,10 +206,10 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	uint8		info = XLogRecGetInfo(r) & XLOG_XACT_OPMASK;
 
 	/*
-	 * If the snapshot isn't yet fully built, we cannot decode anything, so
-	 * bail out.
+	 * If the snapshot hasn't started building yet, the transaction won't be
+	 * decoded or tracked by the snapshot, so bail out.
 	 */
-	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT)
 		return;
 
 	switch (info)
@@ -286,6 +286,9 @@ xact_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 				xid = XLogRecGetXid(r);
 				invals = (xl_xact_invals *) XLogRecGetData(r);
 
+				if (SNAPBUILD_XID_IGNORED(builder, xid))
+					break;
+
 				/*
 				 * Execute the invalidations for xid-less transactions,
 				 * otherwise, accumulate them so that they can be processed at
@@ -418,7 +421,7 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	 * determining the candidate catalog_xmin for the replication slot. See
 	 * SnapBuildProcessRunningXacts().
 	 */
-	if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT)
+	if (SNAPBUILD_XID_IGNORED(builder, xid))
 		return;
 
 	switch (info)
@@ -860,6 +863,9 @@ DecodeAbort(LogicalDecodingContext *ctx, XLogRecordBuffer *buf,
 		abort_time = parsed->origin_timestamp;
 	}
 
+	if (SNAPBUILD_XID_IGNORED(ctx->snapshot_builder, xid))
+		return;
+
 	/*
 	 * Check whether we need to process this transaction. See
 	 * DecodeTXNNeedSkip for the reasons why we sometimes want to skip the
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 6e18baa33cb..a5bfa4ecdb5 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -306,6 +306,15 @@ SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)
 	return ptr < builder->start_decoding_at;
 }
 
+/*
+ * Return the next phase at transaction ID during snapshot building.
+ */
+TransactionId
+SnapBuildNextPhaseAt(SnapBuild *builder)
+{
+	return builder->next_phase_at;
+}
+
 /*
  * Increase refcount of a snapshot.
  *
@@ -952,9 +961,7 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
 	 * Transactions preceding BUILDING_SNAPSHOT will neither be decoded, nor
 	 * will they be part of a snapshot.  So we don't need to record anything.
 	 */
-	if (builder->state == SNAPBUILD_START ||
-		(builder->state == SNAPBUILD_BUILDING_SNAPSHOT &&
-		 TransactionIdPrecedes(xid, builder->next_phase_at)))
+	if (SNAPBUILD_XID_IGNORED(builder, xid))
 	{
 		/* ensure that only commits after this are getting replayed */
 		if (builder->start_decoding_at <= lsn)
diff --git a/src/include/replication/snapbuild.h b/src/include/replication/snapbuild.h
index 44031dcf6e3..509063cf35e 100644
--- a/src/include/replication/snapbuild.h
+++ b/src/include/replication/snapbuild.h
@@ -61,6 +61,11 @@ struct ReorderBuffer;
 struct xl_heap_new_cid;
 struct xl_running_xacts;
 
+#define SNAPBUILD_XID_IGNORED(builder, xid) \
+	(SnapBuildCurrentState((builder)) == SNAPBUILD_START || \
+	 (SnapBuildCurrentState((builder)) == SNAPBUILD_BUILDING_SNAPSHOT && \
+	  TransactionIdPrecedes((xid), SnapBuildNextPhaseAt((builder)))))
+
 extern void CheckPointSnapBuild(void);
 
 extern SnapBuild *AllocateSnapshotBuilder(struct ReorderBuffer *reorder,
@@ -84,6 +89,8 @@ extern bool SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr);
 extern XLogRecPtr SnapBuildGetTwoPhaseAt(SnapBuild *builder);
 extern void SnapBuildSetTwoPhaseAt(SnapBuild *builder, XLogRecPtr ptr);
 
+extern TransactionId SnapBuildNextPhaseAt(SnapBuild *builder);
+
 extern void SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn,
 							   TransactionId xid, int nsubxacts,
 							   TransactionId *subxacts, uint32 xinfo);
#2 ocean_li_996
ocean_li_996@163.com
In reply to: ocean_li_996 (#1)
Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Hi Ajin & Bertrand,

I forgot to CC you on my last email. This is just for your information; no offence intended.

Best regards,
Haiyang Li

#3 cca5507
cca5507@qq.com
In reply to: ocean_li_996 (#1)
Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Hi Haiyang,

Thank you for your summary.

One important thing is that we must not skip any call to ReorderBufferXidSetCatalogChanges() (direct or indirect) during fast-forwarding or while building the snapshot, because the historic snapshot only tracks transactions with catalog changes. v6-0004 seems to skip it in xact_decode().

Here is a related bug:

/messages/by-id/tencent_3A071B760AA1A38540B57F297332B7781C08@qq.com

--
Regards,
ChangAo Chen

#4 ocean_li_996
ocean_li_996@163.com
In reply to: cca5507 (#3)
Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Hi ChangAo,

At 2025-11-22 18:34:23, "cca5507" <cca5507@qq.com> wrote:

> One important thing is that we must not skip any call to ReorderBufferXidSetCatalogChanges() (direct or indirect) during fast-forwarding or while building the snapshot, because the historic snapshot only tracks transactions with catalog changes. v6-0004 seems to skip it in xact_decode().

v6-0004 only skips transactions seen during the START state, and, during BUILDING_SNAPSHOT, those whose xid
precedes next_phase_at (set when changing to the BUILDING_SNAPSHOT state). Those transactions
are always useless, whether fast-forwarding or not. Please check v6-0004 again.

> Here is a related bug:
> /messages/by-id/tencent_3A071B760AA1A38540B57F297332B7781C08@qq.com

Yes, I have looked into that issue, and I think your analysis is correct. But it is
independent of this thread; let's discuss it in the thread where it belongs.

Best regards,
Haiyang Li

#5 ocean_li_996
ocean_li_996@163.com
In reply to: ocean_li_996 (#1)
Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Hi,

Sorry for the direct CC.

Given your expertise in logical replication and your dedication to improving its functionality,
I think the issue described in [1] may be worth some attention. If you have time, I would appreciate
your thoughts or opinions on it.

Thanks and sorry again for the intrusion.

Best regards,
Haiyang Li

[1]: /messages/by-id/3575444b.25e0.19aaae481e0.Coremail.ocean_li_996@163.com

#6 cca5507
cca5507@qq.com
In reply to: ocean_li_996 (#4)
Re: Fix logical decoding not track transaction during SNAPBUILD_BUILDING_SNAPSHOT

Hi Haiyang,

> v6-0004 only skips transactions seen during the START state, and, during BUILDING_SNAPSHOT, those whose xid
> precedes next_phase_at (set when changing to the BUILDING_SNAPSHOT state). Those transactions
> are always useless, whether fast-forwarding or not. Please check v6-0004 again.

Yeah, you're right. What is useful for building the snapshot, and what we currently don't track, are the transactions
that start after BUILDING_SNAPSHOT and commit before FULL_SNAPSHOT; their xids are all >= next_phase_at
during BUILDING_SNAPSHOT.
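
The skip rule agreed on above can be modelled as a standalone predicate. This is a sketch that mirrors the shape of the SNAPBUILD_XID_IGNORED macro from v6-0004 and is not the patch itself: `ToyState`, `ToyXid`, `toy_xid_precedes` and `toy_xid_ignored` are hypothetical names, and the xid comparison is a simplified circular comparison that ignores special xids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the builder states named in this thread. */
typedef enum
{
	TOY_START = -1,
	TOY_BUILDING_SNAPSHOT = 0,
	TOY_FULL_SNAPSHOT = 1,
	TOY_CONSISTENT = 2
} ToyState;

typedef uint32_t ToyXid;

/* Simplified circular comparison of two normal xids: true if a < b. */
static bool
toy_xid_precedes(ToyXid a, ToyXid b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * A transaction is ignored if the builder is still in START, or if it is
 * in BUILDING_SNAPSHOT and the xid precedes next_phase_at.  Everything
 * else (xid >= next_phase_at during BUILDING_SNAPSHOT, or any later state)
 * is tracked.
 */
bool
toy_xid_ignored(ToyState state, ToyXid xid, ToyXid next_phase_at)
{
	assert(state >= TOY_START && state <= TOY_CONSISTENT);
	return state == TOY_START ||
		(state == TOY_BUILDING_SNAPSHOT &&
		 toy_xid_precedes(xid, next_phase_at));
}
```

With an assumed next_phase_at of 100, xid 99 is ignored during BUILDING_SNAPSHOT while xid 100 is tracked, which matches the window of transactions described above.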

--
Regards,
ChangAo Chen