Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

Started by Antonin Houska4 months ago8 messages
#1Antonin Houska
ah@cybertec.at
1 attachment(s)

When working on the REPACK command, we see an ERROR caused by unexpected
change of CurrentResourceOwner [1]/messages/by-id/CADzfLwUgPMLiFkXRnk97ugPqkDfsNJ3TRdw9gjJM=8WB4_nXwQ@mail.gmail.com. I think the problem is that
reorderbuffer.c does not restore the original value after calling
RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle
the call like other callers throughout the tree do.

[1]: /messages/by-id/CADzfLwUgPMLiFkXRnk97ugPqkDfsNJ3TRdw9gjJM=8WB4_nXwQ@mail.gmail.com
/messages/by-id/CADzfLwUgPMLiFkXRnk97ugPqkDfsNJ3TRdw9gjJM=8WB4_nXwQ@mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachments:

0001-Avoid-unexpected-changes-of-CurrentResourceOwner-and.patchtext/x-diffDownload
From 025322cd23c05fa92bb04c8e1ce76ef40003d4cc Mon Sep 17 00:00:00 2001
From: Antonin Houska <ah@cybertec.at>
Date: Wed, 3 Sep 2025 11:33:45 +0200
Subject: [PATCH] Avoid unexpected changes of CurrentResourceOwner and
 CurrentMemoryContext.

Users of logical decoding can encounter unexpected change of
CurrentResourceOwner and CurrentMemoryContext. The problem is that in
reorderbuffer.c, unlike other call sites, we call
RollbackAndReleaseCurrentSubTransaction() without restoring the original
values of these global variables. This patch saves the values prior to the
call and restores them eventually.
---
 src/backend/replication/logical/reorderbuffer.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 34cf05668ae..4736f993c37 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -2215,6 +2215,7 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
 {
 	bool		using_subtxn;
 	MemoryContext ccxt = CurrentMemoryContext;
+	ResourceOwner cowner = CurrentResourceOwner;
 	ReorderBufferIterTXNState *volatile iterstate = NULL;
 	volatile XLogRecPtr prev_lsn = InvalidXLogRecPtr;
 	ReorderBufferChange *volatile specinsert = NULL;
@@ -2692,7 +2693,11 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		}
 
 		if (using_subtxn)
+		{
 			RollbackAndReleaseCurrentSubTransaction();
+			MemoryContextSwitchTo(ccxt);
+			CurrentResourceOwner = cowner;
+		}
 
 		/*
 		 * We are here due to one of the four reasons: 1. Decoding an
@@ -2751,7 +2756,11 @@ ReorderBufferProcessTXN(ReorderBuffer *rb, ReorderBufferTXN *txn,
 		}
 
 		if (using_subtxn)
+		{
 			RollbackAndReleaseCurrentSubTransaction();
+			MemoryContextSwitchTo(ccxt);
+			CurrentResourceOwner = cowner;
+		}
 
 		/*
 		 * The error code ERRCODE_TRANSACTION_ROLLBACK indicates a concurrent
@@ -3244,6 +3253,8 @@ ReorderBufferImmediateInvalidation(ReorderBuffer *rb, uint32 ninvalidations,
 								   SharedInvalidationMessage *invalidations)
 {
 	bool		use_subtxn = IsTransactionOrTransactionBlock();
+	MemoryContext ccxt = CurrentMemoryContext;
+	ResourceOwner cowner = CurrentResourceOwner;
 	int			i;
 
 	if (use_subtxn)
@@ -3262,7 +3273,11 @@ ReorderBufferImmediateInvalidation(ReorderBuffer *rb, uint32 ninvalidations,
 		LocalExecuteInvalidationMessage(&invalidations[i]);
 
 	if (use_subtxn)
+	{
 		RollbackAndReleaseCurrentSubTransaction();
+		MemoryContextSwitchTo(ccxt);
+		CurrentResourceOwner = cowner;
+	}
 }
 
 /*
-- 
2.47.1

#2Mihail Nikalayeu
mihailnikalayeu@gmail.com
In reply to: Antonin Houska (#1)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

Hello, Antonin!

if (using_subtxn)
{
RollbackAndReleaseCurrentSubTransaction();
MemoryContextSwitchTo(ccxt);
CurrentResourceOwner = cowner;
}

IIUC memory context is already switched above:

MemoryContext ecxt = MemoryContextSwitchTo(ccxt);

Best regards,
Mikhail.

#3Álvaro Herrera
alvherre@kurilemu.de
In reply to: Antonin Houska (#1)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

On 2025-Sep-03, Antonin Houska wrote:

When working on the REPACK command, we see an ERROR caused by unexpected
change of CurrentResourceOwner [1]. I think the problem is that
reorderbuffer.c does not restore the original value after calling
RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle
the call like other callers throughout the tree do.

I have registered this as
https://commitfest.postgresql.org/patch/6051/

I've been wondering whether this should be backpatched. In principle
this is a bugfix, so it should, but I don't offhand recall any cases
where failure to set the current context/resowner in the other
reorderbuffer.c users causes a live bug, so ... maybe master only? I'm
wondering if it's possible where anybody _depends_ on the current
behavior, but I suppose that's quite unlikely.

It applies cleanly back to 14, so I would probably stop there in any
case. (A good thing also, because if we mess up 13 with it now and
don't notice until after the next minor is out, there won't be a chance
to unbreak it later.)

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Essentially, you're proposing Kevlar shoes as a solution for the problem
that you want to walk around carrying a loaded gun aimed at your foot.
(Tom Lane)

#4Euler Taveira
euler@eulerto.com
In reply to: Álvaro Herrera (#3)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

On Thu, Sep 11, 2025, at 3:05 PM, Álvaro Herrera wrote:

On 2025-Sep-03, Antonin Houska wrote:

When working on the REPACK command, we see an ERROR caused by unexpected
change of CurrentResourceOwner [1]. I think the problem is that
reorderbuffer.c does not restore the original value after calling
RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle
the call like other callers throughout the tree do.

Interesting. I'm wondering that if this patch is applied we could remove the
following code

/*
* Logical decoding could have clobbered CurrentResourceOwner during
* transaction management, so restore the executor's value. (This is
* a kluge, but it's not worth cleaning up right now.)
*/
CurrentResourceOwner = old_resowner;

from pg_logical_slot_get_changes_guts and LogicalSlotAdvanceAndCheckSnapState
functions too. IIUC the referred code is a band-aid that will be improved
someday.

I have registered this as
https://commitfest.postgresql.org/patch/6051/

I've been wondering whether this should be backpatched. In principle
this is a bugfix, so it should, but I don't offhand recall any cases
where failure to set the current context/resowner in the other
reorderbuffer.c users causes a live bug, so ... maybe master only? I'm
wondering if it's possible where anybody _depends_ on the current
behavior, but I suppose that's quite unlikely.

I would say apply it to master only. If/when we have a bug report we can
backpatch it. Per the crash description, I'm not sure we can create a
reproducible test case with the current supported commands. Am I wrong?

It applies cleanly back to 14, so I would probably stop there in any
case. (A good thing also, because if we mess up 13 with it now and
don't notice until after the next minor is out, there won't be a chance
to unbreak it later.)

I would also refrain to apply scary fixes into an EOL branch.

--
Euler Taveira
EDB https://www.enterprisedb.com/

#5Antonin Houska
ah@cybertec.at
In reply to: Euler Taveira (#4)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

Euler Taveira <euler@eulerto.com> wrote:

On Thu, Sep 11, 2025, at 3:05 PM, Álvaro Herrera wrote:

On 2025-Sep-03, Antonin Houska wrote:

When working on the REPACK command, we see an ERROR caused by unexpected
change of CurrentResourceOwner [1]. I think the problem is that
reorderbuffer.c does not restore the original value after calling
RollbackAndReleaseCurrentSubTransaction(). The attached patch tries to handle
the call like other callers throughout the tree do.

Interesting. I'm wondering that if this patch is applied we could remove the
following code

/*
* Logical decoding could have clobbered CurrentResourceOwner during
* transaction management, so restore the executor's value. (This is
* a kluge, but it's not worth cleaning up right now.)
*/
CurrentResourceOwner = old_resowner;

from pg_logical_slot_get_changes_guts and LogicalSlotAdvanceAndCheckSnapState
functions too. IIUC the referred code is a band-aid that will be improved
someday.

Even though we're fixing the likely reason of this problem, we cannot be 100%
sure that no other problem like this still exists. So I'd not remove this
assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of
that, and adjust the comment?

I have registered this as
https://commitfest.postgresql.org/patch/6051/

I've been wondering whether this should be backpatched. In principle
this is a bugfix, so it should, but I don't offhand recall any cases
where failure to set the current context/resowner in the other
reorderbuffer.c users causes a live bug, so ... maybe master only? I'm
wondering if it's possible where anybody _depends_ on the current
behavior, but I suppose that's quite unlikely.

I would say apply it to master only. If/when we have a bug report we can
backpatch it.

+1

Per the crash description, I'm not sure we can create a
reproducible test case with the current supported commands. Am I wrong?

It seems so, at least with he "CurrentResourceOwner = old_resowner" assignment
in place. REPACK CONCURRENTLY exposes the problem a bit more because it has at
least one kind of resource open during logical decoding: relation.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

#6Álvaro Herrera
alvherre@kurilemu.de
In reply to: Antonin Houska (#5)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

On 2025-Sep-12, Antonin Houska wrote:

Euler Taveira <euler@eulerto.com> wrote:

Interesting. I'm wondering that if this patch is applied we could remove the
following code [...] from pg_logical_slot_get_changes_guts and
LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred
code is a band-aid that will be improved someday.

Even though we're fixing the likely reason of this problem, we cannot be 100%
sure that no other problem like this still exists. So I'd not remove this
assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of
that, and adjust the comment?

Yeah, I'm going to risk removing it, because if we don't do it now,
we're never going to do it. We can mitigate the risk of missing
remaining bugs by having that assertion you suggest, so that if anyone
actually hits a problem here, we'll know soon enough.

I have pushed it with that change. I'll also add an open item for pg19
so that we remember to come back to remove the assertions, if we feel we
no longer need them.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
https://twitter.com/gunnarmorling/status/1596080409259003906

#7Antonin Houska
ah@cybertec.at
In reply to: Álvaro Herrera (#6)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

Álvaro Herrera <alvherre@kurilemu.de> wrote:

On 2025-Sep-12, Antonin Houska wrote:

Euler Taveira <euler@eulerto.com> wrote:

Interesting. I'm wondering that if this patch is applied we could remove the
following code [...] from pg_logical_slot_get_changes_guts and
LogicalSlotAdvanceAndCheckSnapState functions too. IIUC the referred
code is a band-aid that will be improved someday.

Even though we're fixing the likely reason of this problem, we cannot be 100%
sure that no other problem like this still exists. So I'd not remove this
assignment. Maybe add Assert(CurrentResourceOwner == old_resowner) in front of
that, and adjust the comment?

Yeah, I'm going to risk removing it, because if we don't do it now,
we're never going to do it. We can mitigate the risk of missing
remaining bugs by having that assertion you suggest, so that if anyone
actually hits a problem here, we'll know soon enough.

I have pushed it with that change. I'll also add an open item for pg19
so that we remember to come back to remove the assertions, if we feel we
no longer need them.

I was worried about removing those workarounds because it was not trivial to
diagnose the issue. But it should be ok for the master branch. Thanks.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

#8Mihail Nikalayeu
mihailnikalayeu@gmail.com
In reply to: Mihail Nikalayeu (#2)
Re: Unexpected changes of CurrentResourceOwner and CurrentMemoryContext

Hi!

Hello, Antonin!

if (using_subtxn)
{
RollbackAndReleaseCurrentSubTransaction();
MemoryContextSwitchTo(ccxt);
CurrentResourceOwner = cowner;
}

IIUC memory context is already switched above:

MemoryContext ecxt = MemoryContextSwitchTo(ccxt);

Finally realized RollbackAndReleaseCurrentSubTransaction leaves as
with TransactionAbortContext, so, nevermind :)