test failure with gcc-12 -O3 -march=native
Hi,
For my optimized builds I've long used -O3 -march=native. After one of the
recent package updates (I'm not yet certain exactly when), the main regression
tests started to fail with those flags. Oddly enough, the failure is in opr_sanity:
-- Ask access methods to validate opclasses
-- (this replaces a lot of SQL-level checks that used to be done in this file)
SELECT oid, opcname FROM pg_opclass WHERE NOT amvalidate(oid);
- oid | opcname
------+---------
-(0 rows)
+INFO: operator family "array_ops" of access method hash contains function hash_array_extended(anyarray,bigint) with wrong signature for support number 2
+INFO: operator family "bpchar_ops" of access method hash contains function hashbpcharextended(character,bigint) with wrong signature for support number 2
...
+ 16492 | part_test_int4_ops
+ 16497 | part_test_text_ops
+(43 rows)
Given that I did not encounter this problem with gcc-12 before, and that
gcc-12 has been released, it seems less likely to be a bug in our code
highlighted by a new optimization and more likely to be a bug in a gcc bugfix,
but it's definitely not clear.
I only investigated this a tiny bit so far. What fails is the
procform->prorettype != restype comparison in check_hash_func_signature().
Greetings,
Andres Freund
On Thu, Aug 11, 2022 at 01:03:43PM -0700, Andres Freund wrote:
Hi,
For my optimized builds I've long used -O3 -march=native. After one of the
On what kind of arch ?
Given that I did not encounter this problem with gcc-12 before, and that
gcc-12 has been released, it seems less likely to be a bug in our code
highlighted by a new optimization and more likely to be a bug in a gcc bugfix,
but it's definitely not clear.
debian testing is now defaulting to gcc-12.
https://tracker.debian.org/news/1348007/accepted-gcc-defaults-1198-source-into-unstable/
Are you sure you were building with gcc-12 and not gcc(default) which, until 3
weeks ago, was gcc-11 ?
--
Justin
Hi,
On 2022-08-11 20:06:02 -0500, Justin Pryzby wrote:
On Thu, Aug 11, 2022 at 01:03:43PM -0700, Andres Freund wrote:
Hi,
For my optimized builds I've long used -O3 -march=native. After one of the
On what kind of arch ?
x86-64 cascadelake. I've since debugged this further. It's not even -march
that's the problem, it's the difference between -mtune=broadwell and
-mtune=skylake, even with -march=x86-64.
Given that I did not encounter this problem with gcc-12 before, and that
gcc-12 has been released, it seems less likely to be a bug in our code
highlighted by a new optimization and more likely to be a bug in a gcc bugfix,
but it's definitely not clear.
debian testing is now defaulting to gcc-12.
https://tracker.debian.org/news/1348007/accepted-gcc-defaults-1198-source-into-unstable/
Are you sure you were building with gcc-12 and not gcc(default) which, until 3
weeks ago, was gcc-11 ?
Yes.
I'm now bisecting...
Greetings,
Andres Freund
Hi,
On 2022-08-11 18:24:16 -0700, Andres Freund wrote:
Given that I did not encounter this problem with gcc-12 before, and that
gcc-12 has been released, it seems less likely to be a bug in our code
highlighted by a new optimization and more likely to be a bug in a gcc bugfix,
but it's definitely not clear.
debian testing is now defaulting to gcc-12.
https://tracker.debian.org/news/1348007/accepted-gcc-defaults-1198-source-into-unstable/
Are you sure you were building with gcc-12 and not gcc(default) which, until 3
weeks ago, was gcc-11 ?
Yes.
I'm now bisecting...
I found the commit triggering it [1]. Oddly it's a change from a few months
ago, and I can reconstruct from dpkg.log and shell history that I definitely
ran the tests many times since upgrading the compiler. I did however clean my
ccache cache yesterday, I wonder if somehow the 'old' version got stuck in
it. ccache says it checks the compiler's mtime though.
Greetings,
Andres Freund
[1]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1ceddd7497e
Hi,
On 2022-08-11 19:08:14 -0700, Andres Freund wrote:
On 2022-08-11 18:24:16 -0700, Andres Freund wrote:
I'm now bisecting...
I found the commit triggering it [1]. Oddly it's a change from a few months
ago, and I can reconstruct from dpkg.log and shell history that I definitely
ran the tests many times since upgrading the compiler. I did however clean my
ccache cache yesterday, I wonder if somehow the 'old' version got stuck in
it. ccache says it checks the compiler's mtime though.
Spent a fair bit of time reducing the problem to a test case that triggers it
in isolation. This is somewhat scary - I'd be quite surprised if this were the
only place triggering the bug.
And I suspect that it doesn't actually require -mtune=skylake, but I'm not
sure.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106590
Greetings,
Andres Freund