longjmp clobber warnings are utterly broken in modern gcc

Started by Tom Lanealmost 11 years ago19 messages

tgl@sss.pgh.pa.us

almost 11 years ago

I've been looking for other instances of the problem Mark Wilding
pointed out, about missing "volatile" markers on variables that
are modified in PG_TRY blocks and then used in the PG_CATCH stanzas.
There definitely are some. Current gcc versions do not warn about that.

If you turn on -Wclobbered, you don't always get warnings about the
variables that are problematic, and you do get warnings about variables
that are perfectly safe. (This is evidently why that option is not on
by default: it's *useless*, or even counterproductive if it gives you
false confidence that the compiler is protecting you.)

I thought maybe the gcc folk no longer care about this because the
compiler is now smart enough to compile safe code in these situations.
A simple experiment disabused me of that notion. I took guc.c's
AlterSystemSetConfigFile, which is at risk because of this sequence:

int Tmpfd = -1;

...
Tmpfd = open(AutoConfTmpFileName, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
if (Tmpfd < 0)
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not open file \"%s\": %m",
AutoConfTmpFileName)));

PG_TRY();
{
...
close(Tmpfd);
Tmpfd = -1;
...
}
PG_CATCH();
{
if (Tmpfd >= 0)
close(Tmpfd);
...
}
PG_END_TRY();

and compared the assembly language generated with and without adding
"volatile" to Tmpfd's declaration. Without "volatile" (ie, in the
code as shipped), gcc optimizes away the assignment "Tmpfd = -1"
within PG_TRY, and it also optimizes away the if-test in PG_CATCH,
apparently believing that control cannot transfer from inside the
PG_TRY to the PG_CATCH. This is utterly wrong of course. The issue
is masked because we don't bother to test for a failure return from the
second close() call, but it's not hard to think of similar coding
patterns where this type of mistaken optimization would be disastrous.
(Even here, the bogus close call could cause a problem if we'd happened
to open another file during the last part of the PG_TRY stanza.)

Not only does -Wclobbered fail to point out this risk, but to add
insult to injury it does whinge about two *other* variables in the
same function that are actually perfectly safe. I'm not sure what
algorithm it's using to decide what to warn about, but the algorithm
has approximately nothing to do with reality.

I tested this on gcc 4.4.7 (current on RHEL6), and gcc 4.8.3 (current
on Fedora 20); they behave the same. Even if this were fixed in
bleeding-edge gcc, we'd definitely still need the "volatile" marker
to get non-broken code compiled on most current platforms.

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Greg Stark

stark@mit.edu

almost 11 years ago

In reply to: Tom Lane (#1)

Re: longjmp clobber warnings are utterly broken in modern gcc

Some Google(tm)ing does turn up plenty of other people complaining about
similar behaviour. This report seems to have the most enlightening response:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54561

Perhaps Clang has a more useful warning?

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Greg Stark (#2)

Re: longjmp clobber warnings are utterly broken in modern gcc

Greg Stark <stark@mit.edu> writes:

Some Google(tm)ing does turn up plenty of other people complaining about
similar behaviour. This report seems to have the most enlightening response:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54561

Yeah, I saw that before too. I got an interesting response from Jakub J.
just now as well:
https://bugzilla.redhat.com/show_bug.cgi?id=1185673

It sounds like the appearance of the warning is contingent on code
generation decisions, making it even less likely to ever be useful
to us in its current form.

Perhaps Clang has a more useful warning?

Clang, at least the version on my Mac, doesn't warn either with the
settings we normally use, and it doesn't have -Wclobber at all.
I tried turning on -Weverything, and it still didn't complain.
(It did generate incorrect code though, so it's no better than gcc
in that respect.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Martijn van Oosterhout

kleptog@svana.org

almost 11 years ago

In reply to: Tom Lane (#1)

Re: longjmp clobber warnings are utterly broken in modern gcc

On Sun, Jan 25, 2015 at 02:02:47PM -0500, Tom Lane wrote:

and compared the assembly language generated with and without adding
"volatile" to Tmpfd's declaration. Without "volatile" (ie, in the
code as shipped), gcc optimizes away the assignment "Tmpfd = -1"
within PG_TRY, and it also optimizes away the if-test in PG_CATCH,
apparently believing that control cannot transfer from inside the
PG_TRY to the PG_CATCH. This is utterly wrong of course. The issue
is masked because we don't bother to test for a failure return from the
second close() call, but it's not hard to think of similar coding
patterns where this type of mistaken optimization would be disastrous.
(Even here, the bogus close call could cause a problem if we'd happened
to open another file during the last part of the PG_TRY stanza.)

<snip>

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

It's a bit of a long shot, but perhaps if you put something like:

asm volatile("":"":"":"memory")

at the beginning of the catch-block it might convince the compiler to
forget any assumptions about what is in the local variables...

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Martijn van Oosterhout (#4)

Re: longjmp clobber warnings are utterly broken in modern gcc

Martijn van Oosterhout <kleptog@svana.org> writes:

On Sun, Jan 25, 2015 at 02:02:47PM -0500, Tom Lane wrote:

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

It's a bit of a long shot, but perhaps if you put something like:

asm volatile("":"":"":"memory")

at the beginning of the catch-block it might convince the compiler to
forget any assumptions about what is in the local variables...

Meh. Even if that worked for gcc (which as you say is uncertain),
it would help not at all for other compilers. The POSIX requirements
for portable code are clear: we need a "volatile" marker on affected
variables.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Tom Lane (#1)

Re: longjmp clobber warnings are utterly broken in modern gcc

Hi,

On 2015-01-25 14:02:47 -0500, Tom Lane wrote:

I've been looking for other instances of the problem Mark Wilding
pointed out, about missing "volatile" markers on variables that
are modified in PG_TRY blocks and then used in the PG_CATCH stanzas.
There definitely are some. Current gcc versions do not warn about that.

If you turn on -Wclobbered, you don't always get warnings about the
variables that are problematic, and you do get warnings about variables
that are perfectly safe. (This is evidently why that option is not on
by default: it's *useless*, or even counterproductive if it gives you
false confidence that the compiler is protecting you.)

I've observed that too. IIUC the warnings are generated by observing
what register spilling does - which unfortunately seems to mean that the
more complex a function gets the less useful the clobber warnings get
because there's more spilling going on, generating pointless warnings.

I think it's actually not a recent regression - in the past a lot of
spurious instances of these warnings have been fixed by simply tacking
on volatile on variables that didn't actually need it.

I tested this on gcc 4.4.7 (current on RHEL6), and gcc 4.8.3 (current
on Fedora 20); they behave the same. Even if this were fixed in
bleeding-edge gcc, we'd definitely still need the "volatile" marker
to get non-broken code compiled on most current platforms.

It's not fixed (both in the correct warning and not needing anymore sense) at least for
gcc (Debian 20150119-1) 5.0.0 20150119 (experimental) [trunk revision 219863]

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

Not really, except convincing gcc to fix the inaccurate detection. Given
that there've been bugs open about this (IIRC one from you even) for
years I'm not holding my breath.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Tom Lane (#3)

Re: longjmp clobber warnings are utterly broken in modern gcc

On 2015-01-25 15:40:10 -0500, Tom Lane wrote:

Greg Stark <stark@mit.edu> writes:

Perhaps Clang has a more useful warning?

Clang, at least the version on my Mac, doesn't warn either with the
settings we normally use, and it doesn't have -Wclobber at all.
I tried turning on -Weverything, and it still didn't complain.
(It did generate incorrect code though, so it's no better than gcc
in that respect.)

Even a pretty recent (3.6-rc1) clang doesn't warn. It generates lots of
useful warnings, but nothing about longjmp clobbers.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Andres Freund (#6)

Re: longjmp clobber warnings are utterly broken in modern gcc

Andres Freund <andres@2ndquadrant.com> writes:

On 2015-01-25 14:02:47 -0500, Tom Lane wrote:

I've been looking for other instances of the problem Mark Wilding
pointed out, about missing "volatile" markers on variables that
are modified in PG_TRY blocks and then used in the PG_CATCH stanzas.
There definitely are some. Current gcc versions do not warn about that.

I think it's actually not a recent regression - in the past a lot of
spurious instances of these warnings have been fixed by simply tacking
on volatile on variables that didn't actually need it.

Yeah, it's not. For years and years I just automatically stuck a "volatile"
on anything gcc 2.95.3 complained about, so that's why there's so many
volatiles there now. But I've not done that lately, and comparing what
2.95.3 warns about now with what a modern version says with -Wclobbered,
it's clear that it's pretty much the same broken (and perhaps slightly
machine-dependent) algorithm :-(

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

Not really, except convincing gcc to fix the inaccurate detection. Given
that there've been bugs open about this (IIRC one from you even) for
years I'm not holding my breath.

I've completed the audit, and there were a total of only five places
that need fixes (including the two I already patched over the weekend).
It's mostly pretty new code too, which probably explains why we don't
already have field reports of problems.

Interestingly, plpython seems heavily *over* volatilized. Not sure
whether to take some out there for consistency, or just leave it alone.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Tom Lane (#1)

Re: longjmp clobber warnings are utterly broken in modern gcc

On Sun, Jan 25, 2015 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

One idea I've been thinking about for a while is to create some new,
safer notation. Suppose we did this:

PG_TRY_WITH_CLEANUP(cleanup_function, cleanup_argument);
{
/* code requiring cleanup */
}
PG_END_TRY_WITH_CLEANUP();

Instead of doing anything with sigsetjmp(), this would just push a
frame onto a cleanup stack. We would call of those callbacks from
innermost to outermost before doing siglongjmp(). With this design,
we don't need any volatile-ization.

This doesn't work for PG_CATCH() blocks that do not PG_RE_THROW(), but
there are not a ton of those. In a quick search, I found initTrie,
do_autovacuum, xml_is_document, and a number of instances in various
procedural languages. Most instances in the core code could be
converted to the style above. Aside from any reduction in the need
for volatile, this might actually perform slightly better, because
sigsetjmp() is a system call on some platforms. There are probably
few cases where that actually matters, but the one in pq_getmessage(),
for example, might not be entirely discountable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Robert Haas (#9)

Re: longjmp clobber warnings are utterly broken in modern gcc

On 2015-01-26 10:52:07 -0500, Robert Haas wrote:

On Sun, Jan 25, 2015 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

One idea I've been thinking about for a while is to create some new,
safer notation. Suppose we did this:

PG_TRY_WITH_CLEANUP(cleanup_function, cleanup_argument);
{
/* code requiring cleanup */
}
PG_END_TRY_WITH_CLEANUP();

That's pretty similar to to PG_ENSURE_ERROR_CLEANUP, except that that is
also called after FATAL errors. If we do this, we probably should try to
come up with a easier to understand naming scheme. PG_TRY_WITH_CLEANUP
vs. PG_ENSURE_ERROR_CLEANUP isn't very clear to a reader.

Instead of doing anything with sigsetjmp(), this would just push a
frame onto a cleanup stack. We would call of those callbacks from
innermost to outermost before doing siglongjmp(). With this design,
we don't need any volatile-ization.

On the other hand most of the callsites will need some extra state
somewhere to keep track of what to undo. That's a bit of restructuring
work too. And if the cleanup function needs to reference anything done
in the TRY block, that state will need to be volatile too.

Aside from any reduction in the need
for volatile, this might actually perform slightly better, because
sigsetjmp() is a system call on some platforms. There are probably
few cases where that actually matters, but the one in pq_getmessage(),
for example, might not be entirely discountable.

Hm. How would you implement PG_TRY_WITH_CLEANUP without a sigsetjmp()?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Robert Haas (#9)

Re: longjmp clobber warnings are utterly broken in modern gcc

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, Jan 25, 2015 at 2:02 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This is scary as hell. I intend to go around and manually audit
every single PG_TRY in the current source code, but that is obviously
not a long-term solution. Anybody have an idea about how we might
get trustworthy mechanical detection of this type of situation?

One idea I've been thinking about for a while is to create some new,
safer notation. Suppose we did this:

PG_TRY_WITH_CLEANUP(cleanup_function, cleanup_argument);
{
/* code requiring cleanup */
}
PG_END_TRY_WITH_CLEANUP();

Instead of doing anything with sigsetjmp(), this would just push a
frame onto a cleanup stack. We would call of those callbacks from
innermost to outermost before doing siglongjmp(). With this design,
we don't need any volatile-ization.

Maybe not, but the notational pain of exposing everything that the
cleanup_function needs to see would be substantial. We have basically
this design already with PG_ENSURE_ERROR_CLEANUP, and that's a PITA to
use.

Also and perhaps more to the point, I'm no longer convinced that this sort
of thing doesn't require any volatile markers. The fundamental problem
we're hitting with PG_TRY is that the compiler is optimizing on the
assumption that no "unexpected" touches/changes of local variables can
happen as a result of unexpected control flow. I think it might still be
willing to optimize away superficially-dead stores even if you structure
stuff as above. We need to take a closer look at the uses of
PG_ENSURE_ERROR_CLEANUP as well ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Tom Lane (#11)

Re: longjmp clobber warnings are utterly broken in modern gcc

On 2015-01-26 11:18:41 -0500, Tom Lane wrote:

Also and perhaps more to the point, I'm no longer convinced that this sort
of thing doesn't require any volatile markers. The fundamental problem
we're hitting with PG_TRY is that the compiler is optimizing on the
assumption that no "unexpected" touches/changes of local variables can
happen as a result of unexpected control flow. I think it might still be
willing to optimize away superficially-dead stores even if you structure
stuff as above. We need to take a closer look at the uses of
PG_ENSURE_ERROR_CLEANUP as well ...

Robert's premise was that the new notion doesn't allow catching an
error. If the state that's passed isn't endangered (because it's marked
volatile :(), then there's no danger with the bit after the CATCH
block. That's obviously not the case for ENSURE_ERROR_CLEANUP. That
definitely needs volatiles for stuff that's referenced after the TRY
block if modified inside.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Andres Freund (#10)

Re: longjmp clobber warnings are utterly broken in modern gcc

That's pretty similar to to PG_ENSURE_ERROR_CLEANUP, except that that is
also called after FATAL errors. If we do this, we probably should try to
come up with a easier to understand naming scheme. PG_TRY_WITH_CLEANUP
vs. PG_ENSURE_ERROR_CLEANUP isn't very clear to a reader.

Yep.

Instead of doing anything with sigsetjmp(), this would just push a
frame onto a cleanup stack. We would call of those callbacks from
innermost to outermost before doing siglongjmp(). With this design,
we don't need any volatile-ization.

On the other hand most of the callsites will need some extra state
somewhere to keep track of what to undo. That's a bit of restructuring
work too. And if the cleanup function needs to reference anything done
in the TRY block, that state will need to be volatile too.

I don't think so.

Aside from any reduction in the need
for volatile, this might actually perform slightly better, because
sigsetjmp() is a system call on some platforms. There are probably
few cases where that actually matters, but the one in pq_getmessage(),
for example, might not be entirely discountable.

Hm. How would you implement PG_TRY_WITH_CLEANUP without a sigsetjmp()?

Posit:

struct cleanup_entry {
void (*callback)(void *);
void *callback_arg;
struct cleanup_entry *next;
};
cleanup_entry *cleanup_stack = NULL;

So PG_TRY_WITH_CLEANUP(_cb, _cb_arg) does (approximately) this:

{
cleanup_entry e;
cleanup_entry *orige;
e.callback = (_cb);
e.callback_arg = (_cb_arg);
e.next = cleanup_stack;
orige = cleanup_stack;
cleanup_stack = &e;

And when you PG_END_TRY_WITH_CLEANUP, we just do this:

cleanup_stack = orige;
}

And before doing sigsetjmp to the active handler, we run all the
functions on the stack. There shouldn't be any need for volatile; the
compiler has to know that once we make it possible to get at a pointer
to cb_arg via a global variable (cleanup_stack), any function we call
in another translation unit could decide to call that function and it
would need to see up-to-date values of everything cb_arg points to.
So before calling such a function it had better store that data to
memory, not just leave it lying around in a register somewhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Andres Freund

andres@2ndquadrant.com

almost 11 years ago

In reply to: Robert Haas (#13)

Re: longjmp clobber warnings are utterly broken in modern gcc

Aside from any reduction in the need
for volatile, this might actually perform slightly better, because
sigsetjmp() is a system call on some platforms. There are probably
few cases where that actually matters, but the one in pq_getmessage(),
for example, might not be entirely discountable.

Hm. How would you implement PG_TRY_WITH_CLEANUP without a sigsetjmp()?

Posit:

struct cleanup_entry {
void (*callback)(void *);
void *callback_arg;
struct cleanup_entry *next;
};
cleanup_entry *cleanup_stack = NULL;

So PG_TRY_WITH_CLEANUP(_cb, _cb_arg) does (approximately) this:

{
cleanup_entry e;
cleanup_entry *orige;
e.callback = (_cb);
e.callback_arg = (_cb_arg);
e.next = cleanup_stack;
orige = cleanup_stack;
cleanup_stack = &e;

And when you PG_END_TRY_WITH_CLEANUP, we just do this:

cleanup_stack = orige;
}

Hm. Not bad.

[ponder]

I guess we'd need to tie it into PG_exception_stack levels, so it
correctly handles nesting with sigsetjmp locations. In contrast to
sigsetjmp() style handlers we can't rely on PG_CATCH cleaning up that
state.

I wonder how hard it is to make that reliable for errors thrown in a
cleanup callback. Those certainly are possible in some of the PG_CATCHes
we have right now.

And before doing sigsetjmp to the active handler, we run all the
functions on the stack. There shouldn't be any need for volatile; the
compiler has to know that once we make it possible to get at a pointer
to cb_arg via a global variable (cleanup_stack), any function we call
in another translation unit could decide to call that function and it
would need to see up-to-date values of everything cb_arg points to.
So before calling such a function it had better store that data to
memory, not just leave it lying around in a register somewhere.

Given that we, at the moment at least, throw ERRORs from signal
handlers, I don't think that necessarily holds true. I think we're not
that far away from getting rid of all of those though.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Robert Haas

robertmhaas@gmail.com

almost 11 years ago

In reply to: Andres Freund (#14)

Re: longjmp clobber warnings are utterly broken in modern gcc

On Mon, Jan 26, 2015 at 1:34 PM, Andres Freund <andres@2ndquadrant.com> wrote:

I guess we'd need to tie it into PG_exception_stack levels, so it
correctly handles nesting with sigsetjmp locations. In contrast to
sigsetjmp() style handlers we can't rely on PG_CATCH cleaning up that
state.

I was thinking that PG_TRY() would need to squirrel away the value of
cleanup_stack and set it to null, and PG_CATCH would need to restore
the squirreled-away value. That way we fire handlers in
reverse-order-of-registration regardless of which style of
registration is used.

I wonder how hard it is to make that reliable for errors thrown in a
cleanup callback. Those certainly are possible in some of the PG_CATCHes
we have right now.

I think what should happen is that pg_re_throw() should remove each
frame from the stack and then call the associated callback. If
another error occurs, we won't try that particular callback again, but
we'll try the next one. In most cases this should be impossible
anyway because the catch-block is just a variable assignment or
something, but not in all cases, of course.

And before doing sigsetjmp to the active handler, we run all the
functions on the stack. There shouldn't be any need for volatile; the
compiler has to know that once we make it possible to get at a pointer
to cb_arg via a global variable (cleanup_stack), any function we call
in another translation unit could decide to call that function and it
would need to see up-to-date values of everything cb_arg points to.
So before calling such a function it had better store that data to
memory, not just leave it lying around in a register somewhere.

Given that we, at the moment at least, throw ERRORs from signal
handlers, I don't think that necessarily holds true. I think we're not
that far away from getting rid of all of those though.

Well, I think the theory behind that is we should only be throwing
errors from signal handlers when ImmediateInterruptOK = true, which is
only supposed to happen when the code thereby interrupted "isn't doing
anything interesting". So if you set up some state that can be
touched by the error-handling path and then, in the same function, set
ImmediateInterruptOK, yeah, that probably needs to be volatile. But
if function A sets up the state and then calls function B in another
translation unit, and B sets ImmediateInterruptOK, then A has no way
of knowing that B wasn't going to just throw a garden-variety error,
so it better have left things in a tidy state. We still need to
scrutinize the actual functions that set ImmediateInterruptOK and, if
they are static, their callers, but that isn't too many places to look
at. It's certainly (IMHO) a lot better than trying to stick in
volatile qualifiers every place we use a try-catch block, and if you
succeed in getting rid of ImmediateInterruptOK, then it goes away
altogether.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Martijn van Oosterhout

kleptog@svana.org

almost 11 years ago

In reply to: Tom Lane (#5)

Re: longjmp clobber warnings are utterly broken in modern gcc

On Sun, Jan 25, 2015 at 07:11:12PM -0500, Tom Lane wrote:

Martijn van Oosterhout <kleptog@svana.org> writes:

On Sun, Jan 25, 2015 at 02:02:47PM -0500, Tom Lane wrote:
It's a bit of a long shot, but perhaps if you put something like:

asm volatile("":"":"":"memory")

at the beginning of the catch-block it might convince the compiler to
forget any assumptions about what is in the local variables...

Meh. Even if that worked for gcc (which as you say is uncertain),
it would help not at all for other compilers. The POSIX requirements
for portable code are clear: we need a "volatile" marker on affected
variables.

Never mind, it doesn't work. It's not that GCC doesn't know setjmp() is
special, it does (the returns_twice attribute). So GCC does the above
effectivly itself. The problem is that local variables may be stored
in memory over calls in the PG_TRY() block, volatile is a sledgehammer
way of preventing that.

The problem is, GCC doesn't know anything about what the return value
of setjmp() means which means that it can never produce any sensible
warnings in this area.

If you want the compiler to catch this, I don't see any way without
requiring the code to indicate specifically which local variables it
intends to use, or not using the locals at all by using a seperate
cleanup function (as discussed elsewhere in this thread). With
information about the locals you might be able to conjure some GCC
macros to set things up to complain if you use anything else.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

#17

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Martijn van Oosterhout (#16)

Re: longjmp clobber warnings are utterly broken in modern gcc

Martijn van Oosterhout <kleptog@svana.org> writes:

Never mind, it doesn't work. It's not that GCC doesn't know setjmp() is
special, it does (the returns_twice attribute). So GCC does the above
effectivly itself. The problem is that local variables may be stored
in memory over calls in the PG_TRY() block, volatile is a sledgehammer
way of preventing that.

The problem is, GCC doesn't know anything about what the return value
of setjmp() means which means that it can never produce any sensible
warnings in this area.

Yeah. There are actually two distinct issues here:

1. If local variables are kept in registers, longjmp will cause their
values to revert back to whatever they were at setjmp. Forcing them
into memory prevents that, ensuring that the CATCH block can see any
value changes made inside the TRY block.

2. The compiler might make optimizations based on the assumption that
control cannot flow from (any function call in!) the TRY block to the
CATCH block. It might well decide for example that a store to some
variable is dead and remove it.

"volatile" fixes both of these issues but as you say it's a pretty
sledgehammer way of doing it. In any case, the practical problem is
that we don't get any warnings reminding us that there's a hazard.

I've been wondering if we could improve the situation by changing the TRY
macro expansion. Instead of a straight

if (setjmp(...) == 0)
{
TRY code;
}
else
{
CATCH code;
}

we could do something like

volatile int _setjmpresult = setjmp(...);
if (_setjmpresult == 0)
{
TRY code;
}
if (_setjmpresult != 0)
{
CATCH code;
}

The "volatile" marker would prevent gcc from deducing that the CATCH
code cannot be reached after running the TRY code. Unfortunately,
that only partially solves the incorrect-optimization problem.
Given code like, say,

PG_TRY();
{
ptr = palloc...;
... stuff ...
pfree(ptr);
ptr = NULL;

... more stuff ...

ptr = palloc...;
... still more stuff ...
pfree(ptr);
ptr = NULL;

... yet more stuff ...
}
PG_CATCH();
{
if (ptr)
pfree(ptr);
}

the compiler could still decide that the first "ptr = NULL;" is a dead
store. I don't currently see any way around that without a volatile
marker on "ptr"; but maybe somebody can improve on this idea?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Heikki Linnakangas

hlinnakangas@vmware.com

almost 11 years ago

In reply to: Martijn van Oosterhout (#16)

Re: longjmp clobber warnings are utterly broken in modern gcc

On 02/01/2015 03:56 PM, Martijn van Oosterhout wrote:

If you want the compiler to catch this, I don't see any way without
requiring the code to indicate specifically which local variables it
intends to use, or not using the locals at all by using a seperate
cleanup function (as discussed elsewhere in this thread). With
information about the locals you might be able to conjure some GCC
macros to set things up to complain if you use anything else.

I wonder how difficult it would be to teach e.g. clang static analyzer
to catch this, rather than the compiler.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Tom Lane

tgl@sss.pgh.pa.us

almost 11 years ago

In reply to: Heikki Linnakangas (#18)

Re: longjmp clobber warnings are utterly broken in modern gcc

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 02/01/2015 03:56 PM, Martijn van Oosterhout wrote:

If you want the compiler to catch this, I don't see any way without
requiring the code to indicate specifically which local variables it
intends to use, or not using the locals at all by using a seperate
cleanup function (as discussed elsewhere in this thread). With
information about the locals you might be able to conjure some GCC
macros to set things up to complain if you use anything else.

I wonder how difficult it would be to teach e.g. clang static analyzer
to catch this, rather than the compiler.

Maybe we could interest the Coverity crew in this topic. Seems like
the kind of thing they should care about.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers