Suspicion of a compiler bug in clang: using ternary operator in ereport()

Started by Christian Kruse | about 12 years ago | 24 messages | hackers
#1Christian Kruse
christian@2ndquadrant.com

Hi,

just a word of warning: it seems as if there is a compiler bug in clang
regarding the ternary operator when used in ereport(). While working
on a patch I found that this code:

ereport(FATAL,
        (errmsg("could not map anonymous shared memory: %m"),
         (errno == ENOMEM) ?
         errhint("This error usually means that PostgreSQL's request "
                 "for a shared memory segment exceeded available memory "
                 "or swap space. To reduce the request size (currently "
                 "%zu bytes), reduce PostgreSQL's shared memory usage, "
                 "perhaps by reducing shared_buffers or "
                 "max_connections.",
                 *size) : 0));

did not emit an errhint when using clang, although errno == ENOMEM was
true. The same code works with gcc. I used the same data dir, so
config was exactly the same, too.

I reported this bug at clang.org:

<http://llvm.org/bugs/show_bug.cgi?id=18644>

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2Christian Kruse
christian@2ndquadrant.com
In reply to: Christian Kruse (#1)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi,

when I remove the errno comparison and use a 1 it works:

ereport(FATAL,
        (errmsg("could not map anonymous shared memory: %m"),
         1 ?
         errhint("This error usually means that PostgreSQL's request "
                 "for a shared memory segment exceeded available memory "
                 "or swap space. To reduce the request size (currently "
                 "%zu bytes), reduce PostgreSQL's shared memory usage, "
                 "perhaps by reducing shared_buffers or "
                 "max_connections.",
                 *size) : 0));

It also works if I use an if (errno == ENOMEM) instead of the ternary operator.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#1)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Christian Kruse <christian@2ndQuadrant.com> writes:

just a word of warning: it seems as if there is a compiler bug in clang
regarding the ternary operator when used in ereport(). While working
on a patch I found that this code:
...
did not emit an errhint when using clang, although errno == ENOMEM was
true.

Huh. I noticed a buildfarm failure a couple days ago in which the visible
regression diff was that an expected HINT was missing. This probably
explains that, because we use ternary operators in this style in quite a
few places.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Christian Kruse
christian@2ndquadrant.com
In reply to: Christian Kruse (#1)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi,

On 28/01/14 16:43, Christian Kruse wrote:

ereport(FATAL,
        (errmsg("could not map anonymous shared memory: %m"),
         (errno == ENOMEM) ?
         errhint("This error usually means that PostgreSQL's request "
                 "for a shared memory segment exceeded available memory "
                 "or swap space. To reduce the request size (currently "
                 "%zu bytes), reduce PostgreSQL's shared memory usage, "
                 "perhaps by reducing shared_buffers or "
                 "max_connections.",
                 *size) : 0));

did not emit an errhint when using clang, although errno == ENOMEM was
true. The same code works with gcc.

According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
a compiler bug but a difference between gcc and clang. Clang seems to
use a left-to-right order of evaluation while gcc uses a right-to-left
order of evaluation. So if errmsg changes errno this would lead to
errno == ENOMEM evaluated to false. I added a watch point on errno and
it turns out that exactly this happens: in src/common/psprintf.c line
114

nprinted = vsnprintf(buf, len, fmt, args);

errno gets set to 0. This means that we will miss the errhint/errdetail
whenever an ereport() call tests errno in a ternary operator and the code
is compiled with clang.
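
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not PostgreSQL code: fake_errmsg(), fake_errhint() and fake_errfinish() are hypothetical stand-ins for the real functions, and fake_errmsg() resets errno only to mimic what was observed in psprintf.c. It assumes nothing beyond standard C.

```c
#include <errno.h>

/* Hypothetical stand-in for errmsg(): does its work and, as a side effect,
 * resets errno, as vsnprintf() was observed to do in psprintf.c. */
static int fake_errmsg(const char *fmt)
{
    (void) fmt;
    errno = 0;
    return 0;
}

/* Hypothetical stand-in for errhint(): returns 1 to signal "hint attached". */
static int fake_errhint(const char *fmt)
{
    (void) fmt;
    return 1;
}

/* Hypothetical stand-in for errfinish(): reports whether a hint was attached. */
static int fake_errfinish(int msg_result, int hint_result)
{
    (void) msg_result;
    return hint_result;
}

/* Fragile: C leaves the evaluation order of the two arguments unspecified,
 * so the result is 0 or 1 depending on the compiler. */
int report_fragile(void)
{
    errno = ENOMEM;
    return fake_errfinish(fake_errmsg("could not map: %m"),
                          (errno == ENOMEM) ? fake_errhint("reduce usage") : 0);
}

/* What a left-to-right compiler effectively does: message first, test second. */
int report_left_to_right(void)
{
    errno = ENOMEM;
    int msg = fake_errmsg("could not map: %m");
    int hint = (errno == ENOMEM) ? fake_errhint("reduce usage") : 0;
    return fake_errfinish(msg, hint);
}

/* What a right-to-left compiler effectively does: test first, message second. */
int report_right_to_left(void)
{
    errno = ENOMEM;
    int hint = (errno == ENOMEM) ? fake_errhint("reduce usage") : 0;
    int msg = fake_errmsg("could not map: %m");
    return fake_errfinish(msg, hint);
}
```

In this sketch report_left_to_right() loses the hint and report_right_to_left() keeps it, while report_fragile() may legitimately do either.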

Should we work on this issue?

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#4)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Christian Kruse <christian@2ndQuadrant.com> writes:

According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
a compiler bug but a difference between gcc and clang. Clang seems to
use a left-to-right order of evaluation while gcc uses a right-to-left
order of evaluation. So if errmsg changes errno this would lead to
errno == ENOMEM evaluated to false.

Oh! Yeah, that is our own bug then.

Should we work on this issue?

Absolutely. Probably best to save errno into a local just before the
ereport.
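
A minimal sketch of the "save errno into a local" pattern suggested here, written against hypothetical stand-ins rather than the real elog.c API: format_message() plays the role of errmsg() and deliberately clobbers errno, report() plays the role of ereport(), and map_failed() is an invented caller. The hint text is shortened for illustration.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical formatter standing in for errmsg(); it clobbers errno the
 * way vsnprintf() was observed to in the thread. */
static const char *format_message(char *buf, size_t len, const char *text)
{
    snprintf(buf, len, "%s", text);
    errno = 0;
    return buf;
}

/* Hypothetical reporter standing in for ereport(): prints the message and
 * the optional hint, and returns 1 if a hint was emitted. */
static int report(const char *msg, const char *hint)
{
    fprintf(stderr, "FATAL: %s\n", msg);
    if (hint != NULL)
        fprintf(stderr, "HINT: %s\n", hint);
    return hint != NULL;
}

/* Invented caller: failure_errno simulates the errno left by a failed mmap(). */
int map_failed(int failure_errno)
{
    char buf[128];
    int  save_errno;

    errno = failure_errno;
    save_errno = errno;     /* capture errno before any other call runs */

    /* The test now reads the local, so argument evaluation order no
     * longer matters. */
    return report(format_message(buf, sizeof buf,
                                 "could not map anonymous shared memory"),
                  (save_errno == ENOMEM)
                  ? "reduce shared_buffers or max_connections"
                  : NULL);
}
```

With this shape, map_failed(ENOMEM) emits the hint regardless of which argument the compiler evaluates first, and map_failed(EINVAL) does not.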

regards, tom lane


#6Jason Petersen
jason@citusdata.com
In reply to: Christian Kruse (#4)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

I realize Postgres’ codebase is probably intractably large to begin using a tool like splint (http://www.splint.org ), but this is exactly the sort of thing it’ll catch. I’m pretty sure it would have warned in this case that the code relies on an ordering of side effects that is left undefined by C standards (and as seen here implemented differently by two different compilers).

The workaround is to make separate assignments on separate lines, which act as sequence points to impose a total order on the side-effects in question.

—Jason


#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

On 2014-01-28 16:19:11 -0500, Tom Lane wrote:

Christian Kruse <christian@2ndQuadrant.com> writes:

According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not
a compiler bug but a difference between gcc and clang. Clang seems to
use a left-to-right order of evaluation while gcc uses a right-to-left
order of evaluation. So if errmsg changes errno this would lead to
errno == ENOMEM evaluated to false.

Oh! Yeah, that is our own bug then.

Pretty nasty too. Surprising that it didn't cause more issues. It's not
as if evaluation order were the only way this could cause problems...

Should we work on this issue?

Absolutely. Probably best to save errno into a local just before the
ereport.

I think just resetting to edata->saved_errno is better and sufficient?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jason Petersen (#6)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Jason Petersen wrote:

I realize Postgres’ codebase is probably intractably large to begin
using a tool like splint (http://www.splint.org ), but this is exactly
the sort of thing it’ll catch. I’m pretty sure it would have warned in
this case that the code relies on an ordering of side effects that is
left undefined by C standards (and as seen here implemented
differently by two different compilers).

Well, we already have Coverity reports and the VIVA64 stuff posted last
month. Did they not see these problems? Maybe they did, maybe not, but
since there's a large number of false positives it's hard to tell. I
don't know how many false positives we would get from a Splint run, but
my guess is that it'll be a lot.

The workaround is to make separate assignments on separate lines,
which act as sequence points to impose a total order on the
side-effects in question.

Not sure how that would work with a complex macro such as ereport.
Perhaps the answer is to use C99 variadic macros if available, but that
would leave bugs such as this one open on compilers that don't support
variadic macros.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#9Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#8)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

On 2014-01-28 18:31:59 -0300, Alvaro Herrera wrote:

Jason Petersen wrote:

I realize Postgres’ codebase is probably intractably large to begin
using a tool like splint (http://www.splint.org ), but this is exactly
the sort of thing it’ll catch. I’m pretty sure it would have warned in
this case that the code relies on an ordering of side effects that is
left undefined by C standards (and as seen here implemented
differently by two different compilers).

Well, we already have Coverity reports and the VIVA64 stuff posted last
month. Did they not see these problems? Maybe they did, maybe not, but
since there's a large number of false positives it's hard to tell. I
don't know how many false positives we would get from a Splint run, but
my guess is that it'll be a lot.

Well, this isn't really a case of classical undefined behaviour. Most of
the code is actually perfectly well set up to handle the differing
evaluation order; it's just that some bits of code forgot to restore errno.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#10Stephen Frost
sfrost@snowman.net
In reply to: Alvaro Herrera (#8)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

Well, we already have Coverity reports and the VIVA64 stuff posted last
month. Did they not see these problems? Maybe they did, maybe not, but
since there's a large number of false positives it's hard to tell. I
don't know how many false positives we would get from a Splint run, but
my guess is that it'll be a lot.

I've whittled down most of the false positives and gone through just
about all of the rest. I do not recall any reports in Coverity for this
issue and that makes me doubt that it checks for it.

I'll try and take a look at what splint reports this weekend.

Thanks,

Stephen

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#10)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Stephen Frost wrote:

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:

Well, we already have Coverity reports and the VIVA64 stuff posted last
month. Did they not see these problems? Maybe they did, maybe not, but
since there's a large number of false positives it's hard to tell. I
don't know how many false positives we would get from a Splint run, but
my guess is that it'll be a lot.

I've whittled down most of the false positives and gone through just
about all of the rest.

Really? Excellent, thanks. I haven't looked at it in quite a while
apparently ...

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Andres Freund <andres@2ndquadrant.com> writes:

Absolutely. Probably best to save errno into a local just before the
ereport.

I think just resetting to edata->saved_errno is better and sufficient?

-1 --- that field is nobody's business except elog.c's.

regards, tom lane


#13Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#12)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi,

On 28/01/14 22:35, Tom Lane wrote:

Absolutely. Probably best to save errno into a local just before the
ereport.

I think just resetting to edata->saved_errno is better and sufficient?

-1 --- that field is nobody's business except elog.c's.

Ok, so I propose the attached patch as a fix.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

saved_errno.patch (text/x-diff; charset=us-ascii), +4 -0

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#13)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Christian Kruse <christian@2ndquadrant.com> writes:

Ok, so I propose the attached patch as a fix.

No, what I meant is that the ereport caller needs to save errno, rather
than assuming that (some subset of) ereport-related subroutines will
preserve it.

In general, it's unsafe to assume that any nontrivial subroutine preserves
errno, and I don't particularly want to promise that the ereport functions
are an exception. Even if we did that, this type of coding would still
be risky. Here are some examples:

ereport(...,
        foo() ? errdetail(...) : 0,
        (errno == something) ? errhint(...) : 0);

If foo() clobbers errno and returns false, there is nothing that elog.c
can do to make this coding work.

ereport(...,
        errmsg("%s belongs to %s",
               foo(), (errno == something) ? "this" : "that"));

Again, even if every single elog.c entry point saved and restored errno,
this coding wouldn't be safe.

I don't think we should try to make the world safe for some uses of errno
within ereport logic, when there are other very similar-looking uses that
we cannot make safe. What we need is a coding rule that you don't do
that.
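
One way to follow such a rule for the second example is to hoist both the errno capture and the helper call into their own statements, so their order is fixed by the code rather than by the compiler. This is only a sketch under stated assumptions: lookup_owner() is a hypothetical helper playing the role of foo(), describe_owner() is an invented caller, and EACCES stands in for "something".

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper in the role of foo(): returns a name and clobbers
 * errno as a side effect. */
static const char *lookup_owner(void)
{
    errno = 0;
    return "alice";
}

/* Invented caller: failure_errno simulates the errno left by a failed call.
 * Writes the message into out and returns snprintf()'s result. */
int describe_owner(int failure_errno, char *out, size_t len)
{
    errno = failure_errno;

    int         save_errno = errno;         /* 1. capture errno first */
    const char *owner = lookup_owner();     /* 2. helpers may now clobber it */
    const char *kind = (save_errno == EACCES) ? "this" : "that";

    /* 3. the reporting call only sees values that are already fixed */
    return snprintf(out, len, "%s belongs to %s", owner, kind);
}
```

Each statement ends in a sequence point, so the errno test can no longer be affected by lookup_owner(), whatever order the compiler picks for the arguments of the final call.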

regards, tom lane


#15Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#14)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi,

On 29/01/14 13:39, Tom Lane wrote:

No, what I meant is that the ereport caller needs to save errno, rather
than assuming that (some subset of) ereport-related subroutines will
preserve it.
[…]

Your reasoning sounds quite logical to me. Thus I did a

grep -RA 3 "ereport" src/* | less

and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

clang-patches-v1.patch (text/x-diff; charset=us-ascii), +16 -7

#16Christian Kruse
christian@2ndquadrant.com
In reply to: Christian Kruse (#15)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi,

On 29/01/14 21:37, Christian Kruse wrote:

[…]
attached you will find a patch addressing that issue.

Maybe we should include the patch proposed in

<20140129195930.GD31325@defunct.ch>

and do this as one (slightly bigger) patch. Attached you will find
this alternative version.

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

clang-patches-v1.1.patch (text/x-diff; charset=us-ascii), +21 -8

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#15)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Christian Kruse <christian@2ndquadrant.com> writes:

Your reasoning sounds quite logical to me. Thus I did a
grep -RA 3 "ereport" src/* | less
and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Excellent, thanks for doing the legwork. I'll take care of getting
this committed and back-patched.

regards, tom lane


#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#15)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Christian Kruse <christian@2ndquadrant.com> writes:

Your reasoning sounds quite logical to me. Thus I did a
grep -RA 3 "ereport" src/* | less
and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Committed. I found a couple of errors in your patch, but I think
everything is addressed in the patch as committed.

regards, tom lane


#19Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#18)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

Hi Tom,

On 29/01/14 20:06, Tom Lane wrote:

Christian Kruse <christian@2ndquadrant.com> writes:

Your reasoning sounds quite logical to me. Thus I did a
grep -RA 3 "ereport" src/* | less
and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Committed.

Great! Thanks!

I found a couple of errors in your patch, but I think everything is
addressed in the patch as committed.

While I understand most of the modifications, I'm a little bit confused by
some parts. Have a look at this one, for example:

+       *errstr = psprintf(_("failed to look up effective user id %ld: %s"),
+                          (long) user_id,
+                        errno ? strerror(errno) : _("user does not exist"));

Why is it safe here to use errno? It is possible that the _() function
changes errno, isn't it?

Best regards,

--
Christian Kruse http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#20Andres Freund
andres@anarazel.de
In reply to: Christian Kruse (#19)
Re: Suspicion of a compiler bug in clang: using ternary operator in ereport()

On 2014-01-30 08:32:20 +0100, Christian Kruse wrote:

Hi Tom,

On 29/01/14 20:06, Tom Lane wrote:

Christian Kruse <christian@2ndquadrant.com> writes:

Your reasoning sounds quite logical to me. Thus I did a
grep -RA 3 "ereport" src/* | less
and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Committed.

Great! Thanks!

I found a couple of errors in your patch, but I think everything is
addressed in the patch as committed.

While I understand most of the modifications, I'm a little bit confused by
some parts. Have a look at this one, for example:

+       *errstr = psprintf(_("failed to look up effective user id %ld: %s"),
+                          (long) user_id,
+                        errno ? strerror(errno) : _("user does not exist"));

Why is it safe here to use errno? It is possible that the _() function
changes errno, isn't it?

But the evaluation order is strictly defined here, no? First the boolean
check for errno, then *either* strerror(errno), *or* the _().
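
The ordering guarantee referred to here can be checked with a small sketch. translate() is a hypothetical stand-in for _() that counts its calls and clobbers errno; lookup_failure_reason() and translate_call_count() are invented for illustration. The sketch covers only the ternary's own operands; other arguments of an enclosing call are still unordered relative to it.

```c
#include <errno.h>
#include <string.h>

static int translate_calls = 0;

/* Hypothetical stand-in for _(): counts calls and clobbers errno. */
static const char *translate(const char *s)
{
    translate_calls++;
    errno = EINVAL;
    return s;
}

/* Invented caller: failure_errno simulates the errno left by a failed lookup. */
const char *lookup_failure_reason(int failure_errno)
{
    errno = failure_errno;

    /* The condition is evaluated first, with a sequence point after it;
     * then exactly one of the two branches is evaluated. */
    return errno ? strerror(errno) : translate("user does not exist");
}

/* Exposes how often translate() ran, to show the untaken branch is skipped. */
int translate_call_count(void)
{
    return translate_calls;
}
```

When errno is non-zero, translate() is never called, so it cannot disturb the value passed to strerror(); when errno is zero, strerror() is never called.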

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#21Christian Kruse
christian@2ndquadrant.com
In reply to: Andres Freund (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christian Kruse (#21)
#23Christian Kruse
christian@2ndquadrant.com
In reply to: Tom Lane (#22)
#24Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#22)