Notify downstream to discard the streamed transaction which was aborted due to crash.
Hi,
When developing another feature, I find an existing bug which was reported from Dilip[1]/messages/by-id/CAFiTN-sTYk=h75Jn1a7ee+5hOcdQFjKGDvF_0NWQQXmoBv4A+A@mail.gmail.com.
Currently, it's possible that we only send a streaming block without sending a
end of stream message(stream abort) when decoding and streaming a transaction
that was aborted due to crash because we might not WAL log a XLOG_XACT_ABORT
for such a crashed transaction. This will cause the subscriber(with
streaming=on) create a stream file but won't delete it until the apply
worker restart.
BUG repro(borrowed from Dilip):
---
1. start 2 servers(config: logical_decoding_work_mem=64kB)
./pg_ctl -D data/ -c -l pub_logs start
./pg_ctl -D data1/ -c -l sub_logs start
2. Publisher:
create table t(a int PRIMARY KEY ,b text);
create publication test_pub for table t
with(PUBLISH='insert,delete,update,truncate');
alter table t replica identity FULL ;
3. Subscription Server:
create table t(a int,b text);
create subscription test_sub CONNECTION 'host=localhost port=10000
dbname=postgres' PUBLICATION test_pub WITH ( slot_name =
test_slot_sub1,streaming=on);
4. Publication Server:
begin ;
insert into t values (generate_series(1,50000), 'zzzzzzzzz'); -- (while executing this restart publisher in 2-3 secs)
Restart the publication server, while the transaction is still in an
uncommitted state.
./pg_ctl -D data/ -c -l pub_logs restart -mi
---
After restarting the publisher, we can see the subscriber receive a streaming
block and create a stream file(/base/pgsql_tmp/xxx.fileset).
To fix it, One idea is to send a stream abort message when we are cleaning up
crashed transaction on publisher(e.g. in ReorderBufferAbortOld()). And here is
a tiny patch which changes the same. I have confirmed that the bug is fixed and
all regression tests pass. I didn't add a testcase because we need to make sure
the crash happens before all the WAL logged transactions data are decoded which
doesn't seem easy to write a stable test for this.
Thoughts ?
[1]: /messages/by-id/CAFiTN-sTYk=h75Jn1a7ee+5hOcdQFjKGDvF_0NWQQXmoBv4A+A@mail.gmail.com
Best regards,
Hou zj
Attachments:
0001-fix-stream-changes-for-crashed-transaction.patchapplication/octet-stream; name=0001-fix-stream-changes-for-crashed-transaction.patchDownload
From 08815e4107e6104d55694bb75b5410e592007b76 Mon Sep 17 00:00:00 2001
From: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Date: Thu, 5 Jan 2023 18:53:17 +0800
Subject: [PATCH] fix stream changes for crashed transaction
We do not send end-of-stream messages for transactions aborted due to crashes,
which would result in the stream file being left on the subscriber and not
cleaned unless the apply worker is restarted. Fix this by sending a stream
abort message when cleaning up a transaction that was aborted by a crash and
was previously streamed.
---
src/backend/replication/logical/reorderbuffer.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 66fb039..3cb4cb3 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2939,6 +2939,10 @@ ReorderBufferAbortOld(ReorderBuffer *rb, TransactionId oldestRunningXid)
{
elog(DEBUG2, "aborting old transaction %u", txn->xid);
+ /* Notify the remote node about the crash. */
+ if (rbtxn_is_streamed(txn))
+ rb->stream_abort(rb, txn, InvalidXLogRecPtr);
+
/* remove potential on-disk data, and deallocate this tx */
ReorderBufferCleanupTXN(rb, txn);
}
--
2.7.2.windows.1
On Fri, Jan 6, 2023 at 9:25 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
To fix it, One idea is to send a stream abort message when we are cleaning up
crashed transaction on publisher(e.g. in ReorderBufferAbortOld()). And here is
a tiny patch which changes the same. I have confirmed that the bug is fixed and
all regression tests pass. I didn't add a testcase because we need to make sure
the crash happens before all the WAL logged transactions data are decoded which
doesn't seem easy to write a stable test for this.
Your fix looks good to me. Have you tried this in PG-14 where it was introduced?
--
With Regards,
Amit Kapila.
On Friday, January 6, 2023 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Jan 6, 2023 at 9:25 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com>
wrote:To fix it, One idea is to send a stream abort message when we are
cleaning up crashed transaction on publisher(e.g. in
ReorderBufferAbortOld()). And here is a tiny patch which changes the
same. I have confirmed that the bug is fixed and all regression tests
pass. I didn't add a testcase because we need to make sure the crash
happens before all the WAL logged transactions data are decoded whichdoesn't seem easy to write a stable test for this.
Your fix looks good to me. Have you tried this in PG-14 where it was
introduced?
Yes, I have confirmed that PG-14 has the same problem and can be fixed after
applying the patch.
Best regards,
Hou zj
On Fri, Jan 6, 2023 at 9:25 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
Hi,
When developing another feature, I find an existing bug which was reported from Dilip[1].
Currently, it's possible that we only send a streaming block without sending a
end of stream message(stream abort) when decoding and streaming a transaction
that was aborted due to crash because we might not WAL log a XLOG_XACT_ABORT
for such a crashed transaction. This will cause the subscriber(with
streaming=on) create a stream file but won't delete it until the apply
worker restart.BUG repro(borrowed from Dilip):
---
1. start 2 servers(config: logical_decoding_work_mem=64kB)
./pg_ctl -D data/ -c -l pub_logs start
./pg_ctl -D data1/ -c -l sub_logs start2. Publisher:
create table t(a int PRIMARY KEY ,b text);
create publication test_pub for table t
with(PUBLISH='insert,delete,update,truncate');
alter table t replica identity FULL ;3. Subscription Server:
create table t(a int,b text);
create subscription test_sub CONNECTION 'host=localhost port=10000
dbname=postgres' PUBLICATION test_pub WITH ( slot_name =
test_slot_sub1,streaming=on);4. Publication Server:
begin ;
insert into t values (generate_series(1,50000), 'zzzzzzzzz'); -- (while executing this restart publisher in 2-3 secs)Restart the publication server, while the transaction is still in an
uncommitted state.
./pg_ctl -D data/ -c -l pub_logs restart -mi
---After restarting the publisher, we can see the subscriber receive a streaming
block and create a stream file(/base/pgsql_tmp/xxx.fileset).To fix it, One idea is to send a stream abort message when we are cleaning up
crashed transaction on publisher(e.g. in ReorderBufferAbortOld()). And here is
a tiny patch which changes the same. I have confirmed that the bug is fixed and
all regression tests pass. I didn't add a testcase because we need to make sure
the crash happens before all the WAL logged transactions data are decoded which
doesn't seem easy to write a stable test for this.Thoughts ?
Fix looks good to me. Thanks for working on this.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Fri, Jan 6, 2023 at 11:18 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
To fix it, One idea is to send a stream abort message when we are cleaning up
crashed transaction on publisher(e.g. in ReorderBufferAbortOld()). And here is
a tiny patch which changes the same. I have confirmed that the bug is fixed and
all regression tests pass. I didn't add a testcase because we need to make sure
the crash happens before all the WAL logged transactions data are decoded which
doesn't seem easy to write a stable test for this.Thoughts ?
Fix looks good to me. Thanks for working on this.
Pushed!
--
With Regards,
Amit Kapila.