[BUG] "FailedAssertion" reported when streaming in logical replication

Started by tanghy.fnst@fujitsu.comover 4 years ago19 messages
#1tanghy.fnst@fujitsu.com
tanghy.fnst@fujitsu.com
2 attachment(s)

Hi

I think I may found a bug when using streaming in logical replication. Could anyone please take a look at this?

Here's what I did to produce the problem.
I set logical_decoding_work_mem and created multiple publications at publisher, created multiple subscriptions with "streaming = on" at subscriber.
However, an assertion failed at publisher when I COMMIT and ROLLBACK multiple transactions at the same time.

The log reported a FailedAssertion:
TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 3465, PID: 911730)

The problem happens both in synchronous mode and asynchronous mode. When there are only one or two publications, It doesn't seem to happen. (In my case, there are 8 publications and the failure always happened).

The scripts and the log are attached. It took me about 4 minutes to run the script on my machine.
Please contact me if you need more specific info for the problem.

Regards
Tang

Attachments:

pub.logapplication/octet-stream; name=pub.logDownload
script.zipapplication/x-zip-compressed; name=script.zipDownload
#2Dilip Kumar
dilipbalaut@gmail.com
In reply to: tanghy.fnst@fujitsu.com (#1)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, 26 Apr 2021 at 12:45 PM, tanghy.fnst@fujitsu.com <
tanghy.fnst@fujitsu.com> wrote:

Hi

I think I may found a bug when using streaming in logical replication.
Could anyone please take a look at this?

Here's what I did to produce the problem.
I set logical_decoding_work_mem and created multiple publications at
publisher, created multiple subscriptions with "streaming = on" at
subscriber.
However, an assertion failed at publisher when I COMMIT and ROLLBACK
multiple transactions at the same time.

The log reported a FailedAssertion:
TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line:
3465, PID: 911730)

The problem happens both in synchronous mode and asynchronous mode. When
there are only one or two publications, It doesn't seem to happen. (In my
case, there are 8 publications and the failure always happened).

The scripts and the log are attached. It took me about 4 minutes to run
the script on my machine.
Please contact me if you need more specific info for the problem.

Thanks for reporting. I will look into it.

--

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#3Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#2)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, Apr 26, 2021 at 1:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, 26 Apr 2021 at 12:45 PM, tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:

Hi

I think I may found a bug when using streaming in logical replication. Could anyone please take a look at this?

Here's what I did to produce the problem.
I set logical_decoding_work_mem and created multiple publications at publisher, created multiple subscriptions with "streaming = on" at subscriber.
However, an assertion failed at publisher when I COMMIT and ROLLBACK multiple transactions at the same time.

The log reported a FailedAssertion:
TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 3465, PID: 911730)

The problem happens both in synchronous mode and asynchronous mode. When there are only one or two publications, It doesn't seem to happen. (In my case, there are 8 publications and the failure always happened).

The scripts and the log are attached. It took me about 4 minutes to run the script on my machine.
Please contact me if you need more specific info for the problem.

Thanks for reporting. I will look into it.

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL. I think the fix should be while selecting the
largest transaction in ReorderBufferLargestTopTXN, we should check the
base_snapshot should not be NULL.

I will think more about this and post the patch.

From the core dump, we can see that base_snapshot is 0x0 and
ntuplecids = 1, and txn_flags = 1 also proves that it has a new
command id change. And the size of the txn also shows that it has
only one change and that is REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID
because in that case, the change size will be just the
sizeof(ReorderBufferChange) which is 80.

(gdb) p *txn
$4 = {txn_flags = 1, xid = 1115, toplevel_xid = 0, gid = 0x0,
first_lsn = 1061159120, final_lsn = 0, end_lsn = 0, toptxn = 0x0,
restart_decoding_lsn = 958642624,
origin_id = 0, origin_lsn = 0, commit_time = 0, base_snapshot = 0x0,
base_snapshot_lsn = 0, base_snapshot_node = {prev = 0x0, next = 0x0},
snapshot_now = 0x0,
command_id = 4294967295, nentries = 1, nentries_mem = 1, changes =
{head = {prev = 0x3907c18, next = 0x3907c18}}, tuplecids = {head =
{prev = 0x39073d8,
next = 0x39073d8}}, ntuplecids = 1, tuplecid_hash = 0x0,
toast_hash = 0x0, subtxns = {head = {prev = 0x30f1cd8, next =
0x30f1cd8}}, nsubtxns = 0,
ninvalidations = 0, invalidations = 0x0, node = {prev = 0x30f1a98,
next = 0x30c64f8}, size = 80, total_size = 80, concurrent_abort =
false,
output_plugin_private = 0x0}

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#3)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Apr 26, 2021 at 1:26 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, 26 Apr 2021 at 12:45 PM, tanghy.fnst@fujitsu.com <tanghy.fnst@fujitsu.com> wrote:

Hi

I think I may found a bug when using streaming in logical replication. Could anyone please take a look at this?

Here's what I did to produce the problem.
I set logical_decoding_work_mem and created multiple publications at publisher, created multiple subscriptions with "streaming = on" at subscriber.
However, an assertion failed at publisher when I COMMIT and ROLLBACK multiple transactions at the same time.

The log reported a FailedAssertion:
TRAP: FailedAssertion("txn->size == 0", File: "reorderbuffer.c", Line: 3465, PID: 911730)

The problem happens both in synchronous mode and asynchronous mode. When there are only one or two publications, It doesn't seem to happen. (In my case, there are 8 publications and the failure always happened).

The scripts and the log are attached. It took me about 4 minutes to run the script on my machine.
Please contact me if you need more specific info for the problem.

Thanks for reporting. I will look into it.

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

--
With Regards,
Amit Kapila.

#5Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#4)
1 attachment(s)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

Thanks, I have attached a patch to fix this.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v1-0001-Don-t-select-the-transaction-without-base-snapsho.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Don-t-select-the-transaction-without-base-snapsho.patchDownload
From 16d47947002357cc37fdc3debcdf8c376e370188 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 26 Apr 2021 18:19:27 +0530
Subject: [PATCH v1] Don't select the transaction without base snapshot for
 streaming

While selecting the largest top transaction, currently we don't check
whether the transaction has the base snapshot or not, but if the
transaction doesn't have the base snapshot then we can not stream that so
skip such transactions.
---
 src/backend/replication/logical/reorderbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..981619f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3388,7 +3388,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
 		if ((largest != NULL || txn->total_size > largest_size) &&
-			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
+			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
+			!(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
 			largest_size = txn->total_size;
-- 
1.8.3.1

#6Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#5)
2 attachment(s)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

Thanks, I have attached a patch to fix this.

There is also one very silly mistake in below condition, basically,
once we got any transaction for next transaction it is unconditionally
selecting without comparing the size because largest != NULL is wrong,
ideally this should be largest == NULL, basically, if we haven't
select any transaction then only we can approve next transaction
without comparing the size.

if ((largest != NULL || txn->total_size > largest_size) &&
(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
!(rbtxn_has_incomplete_tuple(txn)))

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v1-0001-Don-t-select-the-transaction-without-base-snapsho.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Don-t-select-the-transaction-without-base-snapsho.patchDownload
From 16d47947002357cc37fdc3debcdf8c376e370188 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Mon, 26 Apr 2021 18:19:27 +0530
Subject: [PATCH v1] Don't select the transaction without base snapshot for
 streaming

While selecting the largest top transaction, currently we don't check
whether the transaction has the base snapshot or not, but if the
transaction doesn't have the base snapshot then we can not stream that so
skip such transactions.
---
 src/backend/replication/logical/reorderbuffer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..981619f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3388,7 +3388,8 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
 		if ((largest != NULL || txn->total_size > largest_size) &&
-			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
+			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
+			!(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
 			largest_size = txn->total_size;
-- 
1.8.3.1

v2-0002-Fix-thinko-while-selecting-the-largest-transactio.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Fix-thinko-while-selecting-the-largest-transactio.patchDownload
From c3327f9eda4a880a960afb67ad2ad65b30fdbd32 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 27 Apr 2021 10:53:44 +0530
Subject: [PATCH v2 2/2] Fix thinko while selecting the largest transaction for
 streaming

---
 src/backend/replication/logical/reorderbuffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 981619f..3e53144 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3387,7 +3387,7 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 
 		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
 
-		if ((largest != NULL || txn->total_size > largest_size) &&
+		if ((largest == NULL || txn->total_size > largest_size) &&
 			(txn->base_snapshot != NULL) && (txn->total_size > 0) &&
 			!(rbtxn_has_incomplete_tuple(txn)))
 		{
-- 
1.8.3.1

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#5)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

Thanks, I have attached a patch to fix this.

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Few other minor comments:
1. I think we can update the comments atop function ReorderBufferLargestTopTXN.
2. minor typo in comments atop ReorderBufferLargestTopTXN "...There is
a scope of optimization here such that we can select the largest
transaction which has complete changes...". In this 'complete' should
be incomplete. This is not related to this patch but I think we can
fix it along with this because anyway we are going to change
surrounding comments.

--
With Regards,
Amit Kapila.

#8Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#7)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

Thanks, I have attached a patch to fix this.

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top
transactions and this list will give us sub-transaction as well so we
will have to skip it in the below if condition. So I think using
toplevel_by_lsn and skipping the txn without base_snapshot in below if
condition will be cheaper compared to process all the transactions
with base snapshot i.e. txns_by_base_snapshot_lsn and skipping the
sub-transactions in the below if conditions. Whats your thoughts on
this?

Few other minor comments:
1. I think we can update the comments atop function ReorderBufferLargestTopTXN.
2. minor typo in comments atop ReorderBufferLargestTopTXN "...There is
a scope of optimization here such that we can select the largest
transaction which has complete changes...". In this 'complete' should
be incomplete. This is not related to this patch but I think we can
fix it along with this because anyway we are going to change
surrounding comments.

I will work on these in the next version.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#8)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 11:50 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Apr 27, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 7:52 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Mon, Apr 26, 2021 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Apr 26, 2021 at 5:55 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I am able to reproduce this and I think I have done the initial investigation.

The cause of the issue is that, this transaction has only one change
and that change is XLOG_HEAP2_NEW_CID, which is added through
SnapBuildProcessNewCid. Basically, when we add any changes through
SnapBuildProcessChange we set the base snapshot but when we add
SnapBuildProcessNewCid this we don't set the base snapshot, because
there is nothing to be done for this change. Now, this transaction is
identified as the biggest transaction with non -partial changes, and
now in ReorderBufferStreamTXN, it will return immediately because the
base_snapshot is NULL.

Your analysis sounds correct to me.

Thanks, I have attached a patch to fix this.

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top
transactions and this list will give us sub-transaction as well so we
will have to skip it in the below if condition.

I am not so sure about this point. I have explained above why I think
there won't be any subtransactions in this. Can you please let me know
what am I missing if anything?

--
With Regards,
Amit Kapila.

#10Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#9)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top
transactions and this list will give us sub-transaction as well so we
will have to skip it in the below if condition.

I am not so sure about this point. I have explained above why I think
there won't be any subtransactions in this. Can you please let me know
what am I missing if anything?

Got your point, yeah this will only have top transactions so we can
use this. I will change this in the next patch. In fact we can put
an assert that it should not be an sub transaction?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#10)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 12:22 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Apr 27, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top
transactions and this list will give us sub-transaction as well so we
will have to skip it in the below if condition.

I am not so sure about this point. I have explained above why I think
there won't be any subtransactions in this. Can you please let me know
what am I missing if anything?

Got your point, yeah this will only have top transactions so we can
use this. I will change this in the next patch. In fact we can put
an assert that it should not be an sub transaction?

Right. It is good to have an assert.

--
With Regards,
Amit Kapila.

#12Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#11)
1 attachment(s)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 12:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Apr 27, 2021 at 12:22 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Tue, Apr 27, 2021 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

Can't we use 'txns_by_base_snapshot_lsn' list for this purpose? It is
ensured in ReorderBufferSetBaseSnapshot that we always assign
base_snapshot to a top-level transaction if the current is a known
subxact. I think that will be true because we always form xid-subxid
relation before processing each record in
LogicalDecodingProcessRecord.

Yeah, we can do that, but here we are only interested in top
transactions and this list will give us sub-transaction as well so we
will have to skip it in the below if condition.

I am not so sure about this point. I have explained above why I think
there won't be any subtransactions in this. Can you please let me know
what am I missing if anything?

Got your point, yeah this will only have top transactions so we can
use this. I will change this in the next patch. In fact we can put
an assert that it should not be an sub transaction?

Right. It is good to have an assert.

I have modified the patch based on the above comments.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v3-0001-Assorted-bug-fix-while-selecting-the-largest-top-.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Assorted-bug-fix-while-selecting-the-largest-top-.patchDownload
From 6fcd9be88bd85df4af0508f082dfdad0f8bbf518 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Tue, 27 Apr 2021 13:57:21 +0530
Subject: [PATCH v3] Assorted bug fix while selecting the largest top
 transaction for streaming

There were mainly 2 problems, 1) Ideally, if we haven't selected any transaction
we should take next available transaction without comparing the size but the
condition was wrong and it was selecting the next available transaction without
comparing the size if we had already selected a transaction which was wrong.
2) Another probelm was we were selecting the transaction without checking their
base snapshot, so if the base snapshot is NULL then we can not stream any change
so it was hitting the assert that after streaming txn->size should be 0.  So the
solution is we should never select the transaction for streaming which doesn't
have a base snapshot as we can not stream that transaction.
---
 src/backend/replication/logical/reorderbuffer.c | 31 ++++++++++++++-----------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5cb484f..ea217ef 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3362,16 +3362,17 @@ ReorderBufferLargestTXN(ReorderBuffer *rb)
  * iterate over the limited number of toplevel transactions.
  *
  * Note that, we skip transactions that contains incomplete changes. There
- * is a scope of optimization here such that we can select the largest transaction
- * which has complete changes.  But that will make the code and design quite complex
- * and that might not be worth the benefit.  If we plan to stream the transactions
- * that contains incomplete changes then we need to find a way to partially
- * stream/truncate the transaction changes in-memory and build a mechanism to
- * partially truncate the spilled files.  Additionally, whenever we partially
- * stream the transaction we need to maintain the last streamed lsn and next time
- * we need to restore from that segment and the offset in WAL.  As we stream the
- * changes from the top transaction and restore them subtransaction wise, we need
- * to even remember the subxact from where we streamed the last change.
+ * is a scope of optimization here such that we can select the largest
+ * transaction which has incomplete changes.  But that will make the code and
+ * design quite complex and that might not be worth the benefit.  If we plan to
+ * stream the transactions that contains incomplete changes then we need to
+ * find a way to partially stream/truncate the transaction changes in-memory
+ * and build a mechanism to partially truncate the spilled files.
+ * Additionally, whenever we partially stream the transaction we need to
+ * maintain the last streamed lsn and next time we need to restore from that
+ * segment and the offset in WAL.  As we stream the changes from the top
+ * transaction and restore them subtransaction wise, we need to even remember
+ * the subxact from where we streamed the last change.
  */
 static ReorderBufferTXN *
 ReorderBufferLargestTopTXN(ReorderBuffer *rb)
@@ -3381,13 +3382,17 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 	ReorderBufferTXN *largest = NULL;
 
 	/* Find the largest top-level transaction. */
-	dlist_foreach(iter, &rb->toplevel_by_lsn)
+	dlist_foreach(iter, &rb->txns_by_base_snapshot_lsn)
 	{
 		ReorderBufferTXN *txn;
 
-		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
+		txn = dlist_container(ReorderBufferTXN, base_snapshot_node, iter.cur);
+
+		/* known-as-subtxn txns must not be listed */
+		Assert(!rbtxn_is_known_subxact(txn));
+		Assert(txn->base_snapshot != NULL);
 
-		if ((largest != NULL || txn->total_size > largest_size) &&
+		if ((largest == NULL || txn->total_size > largest_size) &&
 			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
-- 
1.8.3.1

#13tanghy.fnst@fujitsu.com
tanghy.fnst@fujitsu.com
In reply to: Dilip Kumar (#12)
RE: [BUG] "FailedAssertion" reported when streaming in logical replication

I have modified the patch based on the above comments.

Thanks for your patch.
I tested again after applying your patch and the problem is fixed.

Regards
Tang

#14Dilip Kumar
dilipbalaut@gmail.com
In reply to: tanghy.fnst@fujitsu.com (#13)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Wed, Apr 28, 2021 at 12:25 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

I have modified the patch based on the above comments.

Thanks for your patch.
I tested again after applying your patch and the problem is fixed.

Thanks for confirming.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#15Dilip Kumar
dilipbalaut@gmail.com
In reply to: Dilip Kumar (#14)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Wed, Apr 28, 2021 at 1:02 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Apr 28, 2021 at 12:25 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

I have modified the patch based on the above comments.

Thanks for your patch.
I tested again after applying your patch and the problem is fixed.

Thanks for confirming.

I tried to think about how to write a test case for this scenario, but
I think it will not be possible to generate an automated test case for
this. Basically, we need 2 concurrent transactions and out of that,
we need one transaction which just has processed only one change i.e
XLOG_HEAP2_NEW_CID and another transaction should overflow the logical
decoding work mem, so that we select the wrong transaction which
doesn't have the base snapshot. But how to control that the
transaction which is performing the DDL just write the
XLOG_HEAP2_NEW_CID wal and before it writes any other WAL we should
get the WAl from other transaction which overflows the buffer.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#16tanghy.fnst@fujitsu.com
tanghy.fnst@fujitsu.com
In reply to: Dilip Kumar (#15)
RE: [BUG] "FailedAssertion" reported when streaming in logical replication

On Thursday, April 29, 2021 3:18 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote

I tried to think about how to write a test case for this scenario, but
I think it will not be possible to generate an automated test case for this.
Basically, we need 2 concurrent transactions and out of that,
we need one transaction which just has processed only one change i.e
XLOG_HEAP2_NEW_CID and another transaction should overflow the logical
decoding work mem, so that we select the wrong transaction which
doesn't have the base snapshot. But how to control that the
transaction which is performing the DDL just write the
XLOG_HEAP2_NEW_CID wal and before it writes any other WAL we should
get the WAl from other transaction which overflows the buffer.

Thanks for your updating.
Actually, I tried to make the automated test for the problem, too. But made no process on this.
Agreed on your opinion " not be possible to generate an automated test case for this ".

If anyone figure out a good solution for the test automation of this case.
Please be kind to share that with us. Thanks.

Regards,
Tang

#17Dilip Kumar
dilipbalaut@gmail.com
In reply to: tanghy.fnst@fujitsu.com (#16)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Thu, Apr 29, 2021 at 12:09 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:

On Thursday, April 29, 2021 3:18 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote

I tried to think about how to write a test case for this scenario, but
I think it will not be possible to generate an automated test case for this.
Basically, we need 2 concurrent transactions and out of that,
we need one transaction which just has processed only one change i.e
XLOG_HEAP2_NEW_CID and another transaction should overflow the logical
decoding work mem, so that we select the wrong transaction which
doesn't have the base snapshot. But how to control that the
transaction which is performing the DDL just write the
XLOG_HEAP2_NEW_CID wal and before it writes any other WAL we should
get the WAl from other transaction which overflows the buffer.

Thanks for your updating.
Actually, I tried to make the automated test for the problem, too. But made no process on this.
Agreed on your opinion " not be possible to generate an automated test case for this ".

Thanks for trying this out.

If anyone figure out a good solution for the test automation of this case.
Please be kind to share that with us. Thanks.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Dilip Kumar (#12)
1 attachment(s)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Tue, Apr 27, 2021 at 5:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have modified the patch based on the above comments.

The patch looks good to me. I have slightly modified the comments and
commit message. See, what you think of the attached? I think we can
leave the test for this as there doesn't seem to be an easy way to
automate it.

--
With Regards,
Amit Kapila.

Attachments:

v4-0001-Fix-the-bugs-in-selecting-the-transaction-for-str.patchapplication/octet-stream; name=v4-0001-Fix-the-bugs-in-selecting-the-transaction-for-str.patchDownload
From 70d944c1de1eb6a01686cf25ec86be1994e16b9d Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 29 Apr 2021 16:08:10 +0530
Subject: [PATCH v4] Fix the bugs in selecting the transaction for streaming.

There were two problems:
a. We were always selecting the next available txn instead of selecting it
iff it is larger than the previous transaction.
b. We were selecting the transactions which haven't made any changes to the
database (base snapshot is not set). Later it was hitting Assert because we
don't decode such transactions.

Reported-by: Haiying Tang
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB61133B94E63177040F7ECDA1FB429@OS0PR01MB6113.jpnprd01.prod.outlook.com
---
 .../replication/logical/reorderbuffer.c       | 38 +++++++++++--------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c27f7100534..e1e17962e7d 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3362,19 +3362,22 @@ ReorderBufferLargestTXN(ReorderBuffer *rb)
  * This can be seen as an optimized version of ReorderBufferLargestTXN, which
  * should give us the same transaction (because we don't update memory account
  * for subtransaction with streaming, so it's always 0). But we can simply
- * iterate over the limited number of toplevel transactions.
+ * iterate over the limited number of toplevel transactions that have a base
+ * snapshot. There is no use of selecting a transaction that doesn't have base
+ * snapshot because we don't decode such transactions.
  *
  * Note that, we skip transactions that contains incomplete changes. There
- * is a scope of optimization here such that we can select the largest transaction
- * which has complete changes.  But that will make the code and design quite complex
- * and that might not be worth the benefit.  If we plan to stream the transactions
- * that contains incomplete changes then we need to find a way to partially
- * stream/truncate the transaction changes in-memory and build a mechanism to
- * partially truncate the spilled files.  Additionally, whenever we partially
- * stream the transaction we need to maintain the last streamed lsn and next time
- * we need to restore from that segment and the offset in WAL.  As we stream the
- * changes from the top transaction and restore them subtransaction wise, we need
- * to even remember the subxact from where we streamed the last change.
+ * is a scope of optimization here such that we can select the largest
+ * transaction which has incomplete changes.  But that will make the code and
+ * design quite complex and that might not be worth the benefit.  If we plan to
+ * stream the transactions that contains incomplete changes then we need to
+ * find a way to partially stream/truncate the transaction changes in-memory
+ * and build a mechanism to partially truncate the spilled files.
+ * Additionally, whenever we partially stream the transaction we need to
+ * maintain the last streamed lsn and next time we need to restore from that
+ * segment and the offset in WAL.  As we stream the changes from the top
+ * transaction and restore them subtransaction wise, we need to even remember
+ * the subxact from where we streamed the last change.
  */
 static ReorderBufferTXN *
 ReorderBufferLargestTopTXN(ReorderBuffer *rb)
@@ -3383,14 +3386,19 @@ ReorderBufferLargestTopTXN(ReorderBuffer *rb)
 	Size		largest_size = 0;
 	ReorderBufferTXN *largest = NULL;
 
-	/* Find the largest top-level transaction. */
-	dlist_foreach(iter, &rb->toplevel_by_lsn)
+	/* Find the largest top-level transaction having a base snapshot. */
+	dlist_foreach(iter, &rb->txns_by_base_snapshot_lsn)
 	{
 		ReorderBufferTXN *txn;
 
-		txn = dlist_container(ReorderBufferTXN, node, iter.cur);
+		txn = dlist_container(ReorderBufferTXN, base_snapshot_node, iter.cur);
+
+		/* must not be a subtxn */
+		Assert(!rbtxn_is_known_subxact(txn));
+		/* base_snapshot must be set */
+		Assert(txn->base_snapshot != NULL);
 
-		if ((largest != NULL || txn->total_size > largest_size) &&
+		if ((largest == NULL || txn->total_size > largest_size) &&
 			(txn->total_size > 0) && !(rbtxn_has_incomplete_tuple(txn)))
 		{
 			largest = txn;
-- 
2.28.0.windows.1

#19Dilip Kumar
dilipbalaut@gmail.com
In reply to: Amit Kapila (#18)
Re: [BUG] "FailedAssertion" reported when streaming in logical replication

On Thu, Apr 29, 2021 at 5:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Apr 27, 2021 at 5:18 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:

I have modified the patch based on the above comments.

The patch looks good to me. I have slightly modified the comments and
commit message. See, what you think of the attached? I think we can
leave the test for this as there doesn't seem to be an easy way to
automate it.

Your changes look good to me.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com