Avoid distributing new catalog snapshot again for the transaction being decoded.

Started by houzj.fnst@fujitsu.comabout 3 years ago4 messages
#1houzj.fnst@fujitsu.com
houzj.fnst@fujitsu.com
1 attachment(s)

Hi,

When doing some other related work, I noticed that when decoding the COMMIT
record via SnapBuildCommitTxn()-> SnapBuildDistributeNewCatalogSnapshot() we
will add a new snapshot to all transactions including the one being decoded(just committed one).

But since we've already done required modifications in the snapshot for the
current transaction being decoded(in SnapBuildCommitTxn()), so I think we can
avoid adding a snapshot for it again.

Here is the patch to improve this.
Thoughts ?

Best regards,
Hou zhijie

Attachments:

0001-Avoid-distributing-new-catalogsnapshot-for-the-trans.patchapplication/octet-stream; name=0001-Avoid-distributing-new-catalogsnapshot-for-the-trans.patchDownload
From 7662f434a945f3a05e04f1f82bb0d0de3fde835a Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Fri, 25 Nov 2022 14:03:53 +0800
Subject: [PATCH] Avoid distributing new catalogsnapshot for the transaction
 being decoded

Currently, when decoding the COMMIT of a transaction containing catalog
changes, we add a new snapshot to all transactions including the one being
decoded. But since we've already built a new snapshot for the current
transaction, there's no need to add a new one. So skip adding a snapshot of the
current transaction being decoded to improve this.
---
 src/backend/replication/logical/snapbuild.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index a1fd1d9..1d02665 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -288,7 +288,8 @@ static void SnapBuildFreeSnapshot(Snapshot snap);
 
 static void SnapBuildSnapIncRefcount(Snapshot snap);
 
-static void SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn);
+static void SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn,
+												  TransactionId xid);
 
 static inline bool SnapBuildXidHasCatalogChanges(SnapBuild *builder, TransactionId xid,
 												 uint32 xinfo);
@@ -853,7 +854,8 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
  * contents).
  */
 static void
-SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn)
+SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn,
+									  TransactionId xid)
 {
 	dlist_iter	txn_i;
 	ReorderBufferTXN *txn;
@@ -870,6 +872,14 @@ SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn)
 		Assert(TransactionIdIsValid(txn->xid));
 
 		/*
+		 * We've already done required modifications in snapshot for the
+		 * transaction that just committed, so there's no need to add a new
+		 * snapshot for the transaction again.
+		 */
+		if (xid == txn->xid)
+			continue;
+
+		/*
 		 * If we don't have a base snapshot yet, there are no changes in this
 		 * transaction which in turn implies we don't yet need a snapshot at
 		 * all. We'll add a snapshot when the first change gets queued.
@@ -1171,7 +1181,7 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
 		SnapBuildSnapIncRefcount(builder->snapshot);
 
 		/* add a new catalog snapshot to all currently running transactions */
-		SnapBuildDistributeNewCatalogSnapshot(builder, lsn);
+		SnapBuildDistributeNewCatalogSnapshot(builder, lsn, xid);
 	}
 }
 
-- 
2.7.2.windows.1

#2Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: houzj.fnst@fujitsu.com (#1)
Re: Avoid distributing new catalog snapshot again for the transaction being decoded.

Hi Hou,
Thanks for the patch. With a simple condition, we can eliminate the
need to queueing snapshot change in the current transaction and then
applying it. Saves some memory and computation. This looks useful.

When the queue snapshot change is processed in
ReorderBufferProcessTXN(), we call SetupHistoricSnapshot(). But I
didn't find a path through which SetupHistoricSnapshot() is called
from SnapBuildCommitTxn(). Either I missed some code path or it's not
needed. Can you please enlighten me?

--
Best Wishes,
Ashutosh Bapat

On Fri, Nov 25, 2022 at 2:28 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:

Show quoted text

Hi,

When doing some other related work, I noticed that when decoding the COMMIT
record via SnapBuildCommitTxn()-> SnapBuildDistributeNewCatalogSnapshot() we
will add a new snapshot to all transactions including the one being decoded(just committed one).

But since we've already done required modifications in the snapshot for the
current transaction being decoded(in SnapBuildCommitTxn()), so I think we can
avoid adding a snapshot for it again.

Here is the patch to improve this.
Thoughts ?

Best regards,
Hou zhijie

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Ashutosh Bapat (#2)
Re: Avoid distributing new catalog snapshot again for the transaction being decoded.

On Fri, Nov 25, 2022 at 5:30 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Hi Hou,
Thanks for the patch. With a simple condition, we can eliminate the
need to queueing snapshot change in the current transaction and then
applying it. Saves some memory and computation. This looks useful.

When the queue snapshot change is processed in
ReorderBufferProcessTXN(), we call SetupHistoricSnapshot(). But I
didn't find a path through which SetupHistoricSnapshot() is called
from SnapBuildCommitTxn().

It will be called after SnapBuildCommitTxn() via
ReorderBufferCommit()->ReorderBufferReplay()->ReorderBufferProcessTXN().
But, I think what I don't see is how the snapshot we build in
SnapBuildCommitTxn() will be assigned as a historic snapshot. We
assign base_snapshot as a historic snapshot and the new snapshot we
build in SnapBuildCommitTxn() is assigned as base_snapshot only if the
same is not set previously. I might be missing something but if that
is true then I don't think the patch is correct, OTOH I would expect
some existing tests to fail if this change is incorrect.

--
With Regards,
Amit Kapila.

#4wangw.fnst@fujitsu.com
wangw.fnst@fujitsu.com
In reply to: Amit Kapila (#3)
RE: Avoid distributing new catalog snapshot again for the transaction being decoded.

On Sat, Nov 26, 2022 at 19:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Nov 25, 2022 at 5:30 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Hi Hou,
Thanks for the patch. With a simple condition, we can eliminate the
need to queueing snapshot change in the current transaction and then
applying it. Saves some memory and computation. This looks useful.

When the queue snapshot change is processed in
ReorderBufferProcessTXN(), we call SetupHistoricSnapshot(). But I
didn't find a path through which SetupHistoricSnapshot() is called
from SnapBuildCommitTxn().

It will be called after SnapBuildCommitTxn() via ReorderBufferCommit()-

ReorderBufferReplay()->ReorderBufferProcessTXN().

But, I think what I don't see is how the snapshot we build in
SnapBuildCommitTxn() will be assigned as a historic snapshot. We assign
base_snapshot as a historic snapshot and the new snapshot we build in
SnapBuildCommitTxn() is assigned as base_snapshot only if the same is not set
previously. I might be missing something but if that is true then I don't think the
patch is correct, OTOH I would expect some existing tests to fail if this change is
incorrect.

Hi,

I also think that the snapshot we build in SnapBuildCommitTxn() will not be
assigned as a historic snapshot. But I think that when the commandID message is
processed in the function ReorderBufferProcessTXN, the snapshot of the current
transaction will be updated. And I also did some tests and found no problems.

Here is my detailed analysis:

I think that when a transaction internally modifies the catalog, a record of
XLOG_HEAP2_NEW_CID will be inserted into the WAL (see function
log_heap_new_cid). Then during logical decoding, this record will be decoded
into a change of type REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID (see function
ReorderBufferAddNewCommandId). And I think the function ReorderBufferProcessTXN
would update the HistoricSnapshot (member subxip and member curcid of snapshot)
when processing this change.

And here is only one small comment:

+		 * We've already done required modifications in snapshot for the
+		 * transaction that just committed, so there's no need to add a new
+		 * snapshot for the transaction again.
+		 */
+		if (xid == txn->xid)
+			continue;

This comment seems a bit inaccurate. How about the comment below?
```
We don't need to add a snapshot to the transaction that just committed as it
will be able to see the new catalog contents after processing the new commandID
in ReorderBufferProcessTXN.
```

Regards,
Wang wei