Further news on Clang - spurious warnings
I'm happy to report that thanks to some persistent complaining on my
part, the one outstanding issue when building Postgres with Clang -
the spurious warnings that occured as a result of it being statically
detected that there are assignments past what appears to be the end of
a single element array at the end of a struct - has been fixed in a
recent revision. Now, the only warning that remains is that same,
annoying Flex related bug also seen with GCC that we can't seem to do
anything about, because the Flex people refuse to acknowledge that
it's a bug.
However, at least when you see this warning when using Clang, you get
to see a comment beside the declaration that hints that the warning is
spurious:
[peter@laptop postgresql]$ /home/peter/build/Release/bin/clang -O2
-Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -Wformat-security
-fno-strict-aliasing -fwrapv -Wno-error -I. -I. -Isrc/include
-D_GNU_SOURCE -c -o gram.o src/backend/parser/gram.c
In file included from gram.y:12949:
scan.c:16246:23: warning: unused variable 'yyg' [-Wunused-variable]
struct yyguts_t * yyg = (struct yyguts_t*)yyscanner; /* This var
may be unused depending upon options. */
^
1 warning generated.
This is not the case with GCC 4.6:
[peter@laptop postgresql]$ gcc -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wformat-security -fno-strict-aliasing -fwrapv -Wno-error -I. -I.
-Isrc/include -D_GNU_SOURCE -c -o gram.o src/backend/parser/gram.c
In file included from gram.y:12949:0:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16246:23: warning: unused variable ‘yyg’ [-Wunused-variable]
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On 3 August 2011 00:52, Peter Geoghegan <peter@2ndquadrant.com> wrote:
Now, the only warning that remains is that same
Correction - there are actually 3 additional warnings like this in repl_gram.c:
/home/peter/build/Release/bin/clang -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wformat-security -fno-strict-aliasing -fwrapv -I.
-I../../../src/include -D_GNU_SOURCE -c -o repl_gram.o repl_gram.c
repl_gram.y:106:30: warning: implicit conversion from enumeration type
'enum ReplNodeTag' to different enumeration type 'NodeTag' (aka 'enum
NodeTag') [-Wconversion]
(yyval.node) = (Node *)
makeNode(IdentifySystemCmd);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../src/include/nodes/nodes.h:475:64: note: expanded from:
#define makeNode(_type_) ((_type_ *)
newNode(sizeof(_type_),T_##_type_))
^
<scratch space>:180:1: note: expanded from:
T_IdentifySystemCmd
^
../../../src/include/nodes/nodes.h:452:19: note: expanded from:
_result->type = (tag); \
~ ^~~
I'll take a look into them later.
This tautology warning sounds completely valid:
/home/peter/build/Release/bin/clang -O2 -Wall -Wmissing-prototypes
-Wpointer-arith -Wdeclaration-after-statement -Wendif-labels
-Wformat-security -fno-strict-aliasing -fwrapv -pthread -D_REENTRANT
-D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS -fpic -DFRONTEND
-DUNSAFE_STAT_OK -I. -I../../../src/include -D_GNU_SOURCE
-I../../../src/port -I../../../src/port -DSO_MAJOR_VERSION=5 -c -o
fe-exec.o fe-exec.c
fe-exec.c:2408:13: warning: comparison of unsigned enum expression < 0
is always false [-Wtautological-compare]
if (status < 0 || status >= sizeof pgresStatus / sizeof pgresStatus[0])
~~~~~~ ^ ~
1 warning generated.
So, there are 4 remaining warnings.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Attached patch removes the tautologolical part of an evaluated
expression, fixing the problem flagged by this quite valid warning.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Attachments:
remove_tautology.patchtext/x-patch; charset=US-ASCII; name=remove_tautology.patchDownload+1-1
On 03.08.2011 12:25, Peter Geoghegan wrote:
Attached patch removes the tautologolical part of an evaluated
expression, fixing the problem flagged by this quite valid warning.
The check is only tautological if the compiler implements enums as
unsigned integers. Whether enums are signed or not is
implementation-dependent. Perhaps cast status to unsigned or signed
explicitly before the checks?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 3 August 2011 10:34, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
The check is only tautological if the compiler implements enums as unsigned
integers. Whether enums are signed or not is implementation-dependent.
Perhaps cast status to unsigned or signed explicitly before the checks?
I don't believe that the standard allows for an implementation of
enums as unsigned integers - after all, individual enum literals can
be given corresponding negative integer values.
This warning is only seen because the first enum literal in the enum
is explicitly given the value 0, thus precluding the possibility of
the value being < 0, barring some abuse of the enum.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On 03.08.2011 13:05, Peter Geoghegan wrote:
I don't believe that the standard allows for an implementation of
enums as unsigned integers - after all, individual enum literals can
be given corresponding negative integer values.
C99 says:
Each enumerated type shall be compatible with char, a signed integer type, or an
unsigned integer type. The choice of type is implementation-defined,110) but shall be
capable of representing the values of all the members of the enumeration.
See also:
http://stackoverflow.com/questions/2579230/signedness-of-enum-in-c-c99-c-cx-gnu-c-gnu-c99
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 3 August 2011 11:05, Peter Geoghegan <peter@2ndquadrant.com> wrote:
I don't believe that the standard allows for an implementation of
enums as unsigned integers - after all, individual enum literals can
be given corresponding negative integer values.
It actually gives leeway to implement the enum as unsigned int when
the compiler knows that it won't matter, because there are no negative
integer values that correspond to some enum literal. The hint was in
my original warning. :-)
This warning is only seen because the first enum literal in the enum
is explicitly given the value 0, thus precluding the possibility of
the value being < 0, barring some abuse of the enum.
It's also seen if no explicit values are given, and the compiler opts
to make the representation unsigned. It is not seen if it the value is
-1, for example.
Despite the fact that whether or not the value is unsigned is
implementation defined, I think that the patch is still valid - the
expression is at least logically tautological, even if it isn't
necessarily bitwise tautological, because, as I said, barring some
violation of the enum's contract, it should not be < 0. That's
precisely why the compiler has opted to make it unsigned.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On 03.08.2011 14:13, Peter Geoghegan wrote:
On 3 August 2011 11:05, Peter Geoghegan<peter@2ndquadrant.com> wrote:
I don't believe that the standard allows for an implementation of
enums as unsigned integers - after all, individual enum literals can
be given corresponding negative integer values.It actually gives leeway to implement the enum as unsigned int when
the compiler knows that it won't matter, because there are no negative
integer values that correspond to some enum literal. The hint was in
my original warning. :-)This warning is only seen because the first enum literal in the enum
is explicitly given the value 0, thus precluding the possibility of
the value being< 0, barring some abuse of the enum.It's also seen if no explicit values are given, and the compiler opts
to make the representation unsigned. It is not seen if it the value is
-1, for example.Despite the fact that whether or not the value is unsigned is
implementation defined, I think that the patch is still valid - the
expression is at least logically tautological, even if it isn't
necessarily bitwise tautological, because, as I said, barring some
violation of the enum's contract, it should not be< 0. That's
precisely why the compiler has opted to make it unsigned.
Right, but the purpose of that check is to defend from programmer error.
If the programmer screws up and calls "PQresStatus(-1)", we want to give
an error, not crash. If you assume that the programmer will only pass a
valid enum constant as parameter, then you might as well remove the
if-statement altogether. I don't think that would be an improvement.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On 3 August 2011 12:19, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Right, but the purpose of that check is to defend from programmer error. If
the programmer screws up and calls "PQresStatus(-1)", we want to give an
error, not crash. If you assume that the programmer will only pass a valid
enum constant as parameter, then you might as well remove the if-statement
altogether. I don't think that would be an improvement.
Ahh. I failed to consider the intent of the code.
Attached patch has a better solution than casting though - we simply
use an enum literal. The fact that PGRES_EMPTY_QUERY is the first
value in the enum can be safely assumed to be stable, not least
because we've even already explicitly given it a corresponding value
of 0.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Attachments:
remove_tautology.v2.patchtext/x-patch; charset=US-ASCII; name=remove_tautology.v2.patchDownload+1-1
It occurred to me that you may be a little surprised that the patch
actually prevents the warning from occurring. If you have any doubts,
I can assure you that it does. Clang differentiates between 0 as an
rvalue, 0 from an enum literal and 0 resulting from evaluating a macro
expression (including a function-like macro with deep nesting).
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Peter Geoghegan <peter@2ndquadrant.com> writes:
On 3 August 2011 12:19, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:Right, but the purpose of that check is to defend from programmer error. If
the programmer screws up and calls "PQresStatus(-1)", we want to give an
error, not crash. If you assume that the programmer will only pass a valid
enum constant as parameter, then you might as well remove the if-statement
altogether. I don't think that would be an improvement.
Ahh. I failed to consider the intent of the code.
Attached patch has a better solution than casting though - we simply
use an enum literal. The fact that PGRES_EMPTY_QUERY is the first
value in the enum can be safely assumed to be stable, not least
because we've even already explicitly given it a corresponding value
of 0.
No, this is not an improvement at all. The point of the code is that we
are about to use the enum value as an integer array subscript, and we
want to make sure it is within the array bounds. Tying that logic to
some member of the enum is not a readability or safety improvement.
We aren't trusting the caller to pass a valid enum value, and likewise
this code doesn't particularly want to trust the enum definition to be
anything in particular.
There is another point here, though, which is that if we're not sure
whether the compiler considers ExecStatusType to be signed or unsigned,
then we have no idea what the test "status < PGRES_EMPTY_QUERY" even
means. So I think the most reasonable fix is probably
if ((unsigned int) status >= sizeof pgresStatus / sizeof pgresStatus[0])
which is sufficient to cover both directions, since if status is passed
as -1 then it will convert to a large unsigned value. It's also a
natural expression of what we really want, ie, that the integer
equivalent of the enum value is in range.
regards, tom lane
On 3 August 2011 15:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, this is not an improvement at all. The point of the code is that we
are about to use the enum value as an integer array subscript, and we
want to make sure it is within the array bounds. Tying that logic to
some member of the enum is not a readability or safety improvement.
We aren't trusting the caller to pass a valid enum value, and likewise
this code doesn't particularly want to trust the enum definition to be
anything in particular.
I would think that if someone were to have a reason to change the
explicit value set for the identifier PGRES_EMPTY_QUERY, they would
carefully consider the ramifications of doing so. It's far more likely
they'd just append new values to the end of the enum though. This is
why I don't consider that what I've proposed exposes us to any
additional risk.
There is another point here, though, which is that if we're not sure
whether the compiler considers ExecStatusType to be signed or unsigned,
then we have no idea what the test "status < PGRES_EMPTY_QUERY" even
means.
I'm sorry, but I don't know what you mean by this.
So I think the most reasonable fix is probably
if ((unsigned int) status >= sizeof pgresStatus / sizeof pgresStatus[0])
which is sufficient to cover both directions, since if status is passed
as -1 then it will convert to a large unsigned value. It's also a
natural expression of what we really want, ie, that the integer
equivalent of the enum value is in range.
I'm not convinced that that is an improvement to rely on the
conversion doing so, but it's not as if I feel very strongly about it.
--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
On Aug 3, 2011, at 1:47 AM, Peter Geoghegan wrote:
So, there are 4 remaining warnings.
I haven't been paying attention to warnings, but FWIW 9.1beta3 has been building with LLVM by default with Xcode 4. Both on Snow Leopard and on Lion I saw something like this:
try=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1beta3 on x86_64-apple-darwin11.0.0, compiled by i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00), 64-bit
(1 row)
Which is just awesome. Things seem to work great.
Best,
David
On ons, 2011-08-03 at 10:25 +0100, Peter Geoghegan wrote:
Attached patch removes the tautologolical part of an evaluated
expression, fixing the problem flagged by this quite valid warning.
I think these warnings are completely bogus and should not be worked
around. Even in the most trivial case of
{
unsigned int foo;
...
if (foo < 0 && ...)
...
}
I would not want to remove the check, because as the code gets moved
around, refactored, reused, whatever, the unsigned int might change into
a signed int, and then you're hosed. It's fine for -Wextra, but not for
the -Wall level.
On ons, 2011-08-03 at 10:33 -0700, David E. Wheeler wrote:
I haven't been paying attention to warnings, but FWIW 9.1beta3 has
been building with LLVM by default with Xcode 4. Both on Snow Leopard
and on Lion I saw something like this:try=# select version();
version
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1beta3 on x86_64-apple-darwin11.0.0, compiled by
i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc.
build 5658) (LLVM build 2335.15.00), 64-bit
(1 row)Which is just awesome. Things seem to work great.
Note that what you are using there is a GCC frontend with a LLVM
backend. (So I suppose you would get mostly GCC warnings.) Clang is a
new/different compiler frontend using the LLVM backend.
On Aug 3, 2011, at 11:17 AM, Peter Eisentraut wrote:
Note that what you are using there is a GCC frontend with a LLVM
backend. (So I suppose you would get mostly GCC warnings.) Clang is a
new/different compiler frontend using the LLVM backend.
Yeah, not sure whether or not to tweak the Makefile to use Clang. I guess not?
David
On 3 Aug 2011, at 19:20, David E. Wheeler wrote:
Yeah, not sure whether or not to tweak the Makefile to use Clang. I guess not?
export CC=clang
./configure
...
On Aug 3, 2011, at 11:28 AM, Grzegorz Jaskiewicz wrote:
export CC=clang
./configure
...
Thanks, I'll give that a try the next time I build (RC1 I guess).
David
Peter Geoghegan <peter@2ndquadrant.com> writes:
On 3 August 2011 15:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
There is another point here, though, which is that if we're not sure
whether the compiler considers ExecStatusType to be signed or unsigned,
then we have no idea what the test "status < PGRES_EMPTY_QUERY" even
means.
I'm sorry, but I don't know what you mean by this.
I mean that it's unclear what you'll get if status has a bitpattern
equivalent to a negative integer. If the compiler implements the
comparison as signed, the test will yield TRUE; if unsigned, it's FALSE.
So I think the most reasonable fix is probably
if ((unsigned int) status >= sizeof pgresStatus / sizeof pgresStatus[0])
which is sufficient to cover both directions, since if status is passed
as -1 then it will convert to a large unsigned value. It's also a
natural expression of what we really want, ie, that the integer
equivalent of the enum value is in range.
I'm not convinced that that is an improvement to rely on the
conversion doing so, but it's not as if I feel very strongly about it.
The C standard specifies that signed-to-unsigned conversions must work
like that; and even if the standard didn't, it would surely work like
that on any machine with two's-complement representation, which is to
say every computer built in the last forty years or so. So I don't find
it a questionable assumption.
regards, tom lane
On Wed, Aug 03, 2011 at 04:03:39PM -0400, Tom Lane wrote:
The C standard specifies that signed-to-unsigned conversions must work
like that; and even if the standard didn't, it would surely work like
that on any machine with two's-complement representation, which is to
say every computer built in the last forty years or so. So I don't find
it a questionable assumption.
I had the "pleasure" of working on a Univac 1108 in about 1978 and
it was very definitely ones complement. I'm somewhat amazed to find that
the Univac 1100 series architecture and instruction set lives on to this
day. The last pure 1100 seems to be the Unisys 2200/3800 released in 1997.
Even later U1100/Exec 8 descendants appear to still exist and are still
actively supported:
http://en.wikipedia.org/wiki/Unisys_OS_2200_operating_system
So there are still ones complement machines out there. However I suggest we
pretend otherwise and continue to ignore them.
-dg
--
David Gould daveg@sonic.net 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.