Missing error_context_stack = NULL in AutoVacWorkerMain()

Started by Ashwin Agrawalabout 6 years ago7 messages
#1Ashwin Agrawal
aagrawal@pivotal.io

I am not sure if this causes any potential problems or not, but for
consistency of code seems we are missing below. All other places in code
where sigsetjmp() exists for top level handling has error_context_stack set
to NULL.

diff --git a/src/backend/postmaster/autovacuum.c
b/src/backend/postmaster/autovacuum.c
index 073f313337..b06d0ad058 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -1558,6 +1558,9 @@ AutoVacWorkerMain(int argc, char *argv[])
         */
        if (sigsetjmp(local_sigjmp_buf, 1) != 0)
        {
+               /* Since not using PG_TRY, must reset error stack by hand */
+               error_context_stack = NULL;
+
                /* Prevents interrupts while cleaning up */
                HOLD_INTERRUPTS();

This was spotted by Paul during code inspection.

#2Michael Paquier
michael@paquier.xyz
In reply to: Ashwin Agrawal (#1)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

On Fri, Oct 18, 2019 at 05:55:32PM -0700, Ashwin Agrawal wrote:

I am not sure if this causes any potential problems or not, but for
consistency of code seems we are missing below. All other places in code
where sigsetjmp() exists for top level handling has error_context_stack set
to NULL.

Resetting error_context_stack prevents calling any callbacks which may
be set. These would not be much useful in this context anyway, and
visibly that's actually not an issue with the autovacuum code so far
(I don't recall seeing a custom callback setup in this area, but I may
have missed something). So fixing it would be a good thing actually,
on HEAD.

Any thoughts from others?
--
Michael

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#2)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

Michael Paquier <michael@paquier.xyz> writes:

On Fri, Oct 18, 2019 at 05:55:32PM -0700, Ashwin Agrawal wrote:

I am not sure if this causes any potential problems or not, but for
consistency of code seems we are missing below. All other places in code
where sigsetjmp() exists for top level handling has error_context_stack set
to NULL.

Resetting error_context_stack prevents calling any callbacks which may
be set. These would not be much useful in this context anyway, and
visibly that's actually not an issue with the autovacuum code so far
(I don't recall seeing a custom callback setup in this area, but I may
have missed something). So fixing it would be a good thing actually,
on HEAD.

Any thoughts from others?

This seems like a real and possibly serious bug to me. Backend sigsetjmp
callers *must* clear error_context_stack (or restore it to a previous
value), because if it isn't NULL it's surely pointing at garbage, ie a
local variable that's no longer part of the valid stack.

The issue might be argued to be insignificant because the autovacuum
worker is just going to do proc_exit anyway. But if it encountered
another error during proc_exit, elog.c might try to invoke error
callbacks using garbage callback data.

In short, I think we'd better back-patch too.

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

I wrote:

The issue might be argued to be insignificant because the autovacuum
worker is just going to do proc_exit anyway. But if it encountered
another error during proc_exit, elog.c might try to invoke error
callbacks using garbage callback data.

Oh --- looking closer, proc_exit itself will clear error_context_stack
before doing much. So a problem would only occur if we suffered an error
during EmitErrorReport, which seems somewhat unlikely. Still, it's bad
that this code isn't like all the others. There's certainly no downside
to clearing the pointer.

regards, tom lane

#5Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#3)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

On Mon, Oct 21, 2019 at 12:47:40AM -0400, Tom Lane wrote:

This seems like a real and possibly serious bug to me. Backend sigsetjmp
callers *must* clear error_context_stack (or restore it to a previous
value), because if it isn't NULL it's surely pointing at garbage, ie a
local variable that's no longer part of the valid stack.

Sure. From my recollection of memories we never set it in autovacuum
code paths (including index entry deletions), so I don't think that we
have an actual live bug here.

The issue might be argued to be insignificant because the autovacuum
worker is just going to do proc_exit anyway. But if it encountered
another error during proc_exit, elog.c might try to invoke error
callbacks using garbage callback data.

In short, I think we'd better back-patch too.

Okay, no objections to back-patch.
--
Michael

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#5)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

Michael Paquier <michael@paquier.xyz> writes:

On Mon, Oct 21, 2019 at 12:47:40AM -0400, Tom Lane wrote:

This seems like a real and possibly serious bug to me. Backend sigsetjmp
callers *must* clear error_context_stack (or restore it to a previous
value), because if it isn't NULL it's surely pointing at garbage, ie a
local variable that's no longer part of the valid stack.

Sure. From my recollection of memories we never set it in autovacuum
code paths (including index entry deletions), so I don't think that we
have an actual live bug here.

Uh ... what about, say, auto-analyze on an expression index? That
could call user-defined PL functions and thus reach just about all
of the backend.

regards, tom lane

#7Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#4)
Re: Missing error_context_stack = NULL in AutoVacWorkerMain()

On Mon, Oct 21, 2019 at 12:53:27AM -0400, Tom Lane wrote:

Oh --- looking closer, proc_exit itself will clear error_context_stack
before doing much. So a problem would only occur if we suffered an error
during EmitErrorReport, which seems somewhat unlikely. Still, it's bad
that this code isn't like all the others. There's certainly no downside
to clearing the pointer.

Good point about index predicates/expressions. There is the elog()
hook as well in the area, and it's hard to predict how people use
that. So applied and back-patched down 9.4.
--
Michael