[buildfarm related] Machines gcc experimental failed test_lfind

Started by Hayato Kuroda (Fujitsu) about 2 months ago, 7 messages
#1 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com

Hi hackers,

Sorry if this has already been discussed somewhere.

While looking at the buildfarm, I found that the regression tests for modules
have recently been failing intermittently [1]. A typical example is [2], which failed at simd.h.
The failures occur across branches, and the affected animals use experimental gcc. Based on that,
I suspect recent gcc commits might be related.

I don't know much about this area, but I looked through the gcc commits and listed
candidates [4], [5], [6]. Can you find anything from these?
Also, what should we do about nightly-built compilers? Should we fix the tests or the code for them?

[1]: https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=3&stage=TestModulesCheck-C&filter=Submit
[2]: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=leafhopper&dt=2025-11-25%2014%3A50%3A04&stg=testmodules-install-check-C
[3]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1c9d321611367608d6bc1d97cf35b4c1bcb4b2d1
[4]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b24f73a8a5210b9877ae98112b0a87f2aeae0c62
[5]: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=2f7d90ef65c6f09106c18b99a9590b8f81933115

Best regards,
Hayato Kuroda
FUJITSU LIMITED

#2 John Naylor
johncnaylorls@gmail.com
In reply to: Hayato Kuroda (Fujitsu) (#1)
Re: [buildfarm related] Machines gcc experimental failed test_lfind

On Wed, Nov 26, 2025 at 1:08 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:

> While looking at the buildfarm, I found that the regression tests for modules
> have recently been failing intermittently [1]. A typical example is [2], which failed at simd.h.
> The failures occur across branches, and the affected animals use experimental gcc. Based on that,
> I suspect recent gcc commits might be related.
>
> I don't know much about this area, but I looked through the gcc commits and listed
> candidates [4], [5], [6]. Can you find anything from these?

We're not compiler engineers.

> Also, what should we do about nightly-built compilers? Should we fix the tests or the code for them?

We might ask ourselves how often these have resulted in
forward-looking fixes for our code, weighed against spurious failures
and compiler bugs.

--
John Naylor
Amazon Web Services

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Naylor (#2)
Re: [buildfarm related] Machines gcc experimental failed test_lfind

John Naylor <johncnaylorls@gmail.com> writes:

> On Wed, Nov 26, 2025 at 1:08 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
>> Also, what should we do about nightly-built compilers? Should we fix the tests or the code for them?
>
> We might ask ourselves how often these have resulted in
> forward-looking fixes for our code, weighed against spurious failures
> and compiler bugs.

If a problem is being observed only on animals with experimental
compilers, it's almost surely a compiler bug and not something
we should spend time on. In this case, the failure is in test
code that hasn't changed in several years, making it even less
likely that we broke it.

I'm not really thrilled that toolchains like these are being
run as normal buildfarm animals. Even with the "experimental"
annotation, there's too much chance that PG hackers will spend
time analyzing issues that aren't ours. I do realize that
there's value in this sort of testing, I just wish the results
were more clearly identified as "probably not our problem".

regards, tom lane

#4 Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#3)
Re: [buildfarm related] Machines gcc experimental failed test_lfind

On Wed, Nov 26, 2025 at 10:51:12AM -0500, Tom Lane wrote:

> If a problem is being observed only on animals with experimental
> compilers, it's almost surely a compiler bug and not something
> we should spend time on. In this case, the failure is in test
> code that hasn't changed in several years, making it even less
> likely that we broke it.

It looks like the failures are limited to AArch64, and commit f8f4afe did
change the code in question a bit. However, there are failures on older
PG versions where that commit was not applied, so the evidence does seem to
point to a compiler bug.

--
nathan

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#4)
Re: [buildfarm related] Machines gcc experimental failed test_lfind

Nathan Bossart <nathandbossart@gmail.com> writes:

> It looks like the failures are limited to AArch64, and commit f8f4afe did
> change the code in question a bit. However, there are failures on older
> PG versions where that commit was not applied, so the evidence does seem to
> point to a compiler bug.

The present failures started much more recently than f8f4afe ...
leafhopper, for example, was okay until about a day ago.
Also, AFAICS f8f4afe was HEAD-only, but leafhopper is also
failing in v18. (No evidence as yet about older branches.)

Anyway, this is exactly the sort of analysis that we shouldn't
be doing.

regards, tom lane

#6 Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#5)
Re: [buildfarm related] Machines gcc experimental failed test_lfind

On Wed, Nov 26, 2025 at 11:10:31AM -0500, Tom Lane wrote:

> Anyway, this is exactly the sort of analysis that we shouldn't
> be doing.

Well, I wanted to be sure we weren't accidentally depending on undefined
behavior, like what we fixed in commit 43da394. Something like that seemed
within the realm of possibility.
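
For readers who have not looked at this kind of test, here is a minimal sketch of the general idea: compare an optimizer-friendly search against an obviously correct scalar one over many lengths and keys. This is not the PostgreSQL code. The names lfind32_reference, lfind32_blocked and count_lfind_mismatches are hypothetical, and the block-of-4 loop only stands in for real SIMD code. If the two functions disagree under only one compiler, and a build with gcc's -fsanitize=undefined reports nothing, that points toward the compiler rather than the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ELEMS 64

/* Scalar reference: one element at a time, easy to verify by eye. */
static bool
lfind32_reference(uint32_t key, const uint32_t *base, size_t nelem)
{
    for (size_t i = 0; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }
    return false;
}

/*
 * Hypothetical stand-in for a vectorized search: compare four elements
 * per iteration and OR the results, then handle the tail one by one.
 */
static bool
lfind32_blocked(uint32_t key, const uint32_t *base, size_t nelem)
{
    size_t i = 0;
    size_t blocked_end = nelem - (nelem % 4);

    assert(base != NULL || nelem == 0);

    for (; i < blocked_end; i += 4)
    {
        bool hit = false;

        for (size_t j = 0; j < 4; j++)
            hit |= (base[i + j] == key);
        if (hit)
            return true;
    }

    /* Tail: fewer than four elements remain. */
    for (; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }
    return false;
}

/*
 * Run both searches over every prefix length and a range of keys
 * (present and absent), and count how often they disagree.
 */
int
count_lfind_mismatches(void)
{
    uint32_t data[MAX_ELEMS];
    int mismatches = 0;

    /* Even values only, so odd keys are always absent. */
    for (size_t i = 0; i < MAX_ELEMS; i++)
        data[i] = (uint32_t) (i * 2);

    for (size_t nelem = 0; nelem <= MAX_ELEMS; nelem++)
    {
        for (uint32_t key = 0; key <= 2 * MAX_ELEMS; key++)
        {
            if (lfind32_blocked(key, data, nelem) !=
                lfind32_reference(key, data, nelem))
                mismatches++;
        }
    }
    return mismatches;
}
```

Calling count_lfind_mismatches() from a small main() and expecting 0 is enough to use it. Covering every prefix length matters because bugs of this kind tend to show up only at block boundaries or in the tail handling.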

--
nathan

#7 Hayato Kuroda (Fujitsu)
kuroda.hayato@fujitsu.com
In reply to: Nathan Bossart (#6)
RE: [buildfarm related] Machines gcc experimental failed test_lfind

Dear John, Tom and Nathan,

Thanks for looking at the issue. I found that the animals are now green, e.g. [1], so
hopefully it was a compiler bug. I learned that there is no need to spend time on
experimental tools unless we have a specific idea about the cause.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=snakefly&br=master

Best regards,
Hayato Kuroda
FUJITSU LIMITED