How did queensnake corrupt zic.o?

Started by Thomas Munro, almost 4 years ago, 4 messages
#1 Thomas Munro
thomas.munro@gmail.com

Hi,

We see a successful compile and then a failure to read the file while
linking. We see that the animal got into that state recently and then
fixed itself, and now it's back in that state. I don't know if it's
significant, but it happened to fix itself when a configure change
came along, which might be explained by ccache invalidation; that is,
the failure mode doesn't depend on the input files, but once it's
borked you need a change to kick ccache. My memory may be playing
tricks on me but I vaguely recall seeing another animal do this, a
while back.
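
To make the ccache theory above concrete, here is a minimal Python sketch of a cache keyed only on its inputs. It is a toy model, not how ccache is implemented: the class `ToyCompileCache`, the fake "object file" bytes, and the flag `-DNEW_CONFIGURE_FLAG` are all hypothetical names made up for illustration. It shows that once a stored entry is damaged, identical inputs keep returning the damaged result, and only a change to the inputs (such as a configure change) produces a new key and a fresh compile.

```python
import hashlib


class ToyCompileCache:
    """Toy stand-in for a compiler cache (hypothetical; not ccache's design)."""

    def __init__(self):
        self.store = {}  # cache key -> cached "object file" bytes
        self.hits = 0

    def key(self, source: bytes, flags: str) -> str:
        # The key depends only on the inputs, never on the stored result.
        return hashlib.sha256(flags.encode() + b"\0" + source).hexdigest()

    def compile(self, source: bytes, flags: str) -> bytes:
        k = self.key(source, flags)
        if k in self.store:
            # Cache hit: the stored bytes are returned without being checked.
            self.hits += 1
            return self.store[k]
        # Cache miss: pretend to compile, then remember the result.
        obj = b"OBJ:" + hashlib.sha256(source).digest()
        self.store[k] = obj
        return obj


cache = ToyCompileCache()
zic_c = b"int main(void) { return 0; }"

good = cache.compile(zic_c, "-O2")  # miss: compiled and stored

# Assumption for illustration: the stored entry gets damaged somehow.
cache.store[cache.key(zic_c, "-O2")] = b"\x00garbage"

stale = cache.compile(zic_c, "-O2")  # hit: the damaged entry comes back

# A configure change alters the flags, so the key changes and we recompile.
fresh = cache.compile(zic_c, "-O2 -DNEW_CONFIGURE_FLAG")
```

In this model the "compile succeeded" step and the "linker cannot read the file" step are consistent with each other, because a cache hit reports success while handing back whatever bytes were stored.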

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#1)
Re: How did queensnake corrupt zic.o?

Thomas Munro <thomas.munro@gmail.com> writes:

We see a successful compile and then a failure to read the file while
linking. We see that the animal got into that state recently and then
fixed itself, and now it's back in that state. I don't know if it's
significant, but it happened to fix itself when a configure change
came along, which might be explained by ccache invalidation; that is,
the failure mode doesn't depend on the input files, but once it's
borked you need a change to kick ccache. My memory may be playing
tricks on me but I vaguely recall seeing another animal do this, a
while back.

queensnake's seen repeated cycles of unexplainable build failures.
I wonder if it's using a bogus ccache version, or if the machine
itself is flaky.

regards, tom lane

#3 Filipe Rosset
rosset.filipe@gmail.com
In reply to: Tom Lane (#2)
Re: How did queensnake corrupt zic.o?

Hi guys,
I had to disable ccache on queensnake; all builds are fine right now. Let's
see how it goes.
Due to the OS migrations on queensnake (Scientific Linux -> CentOS -> Rocky Linux),
I think it's time to decommission queensnake and request a new animal
for a new server, maybe.

On Mon, Feb 14, 2022, 03:25 Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

We see a successful compile and then a failure to read the file while
linking. We see that the animal got into that state recently and then
fixed itself, and now it's back in that state. I don't know if it's
significant, but it happened to fix itself when a configure change
came along, which might be explained by ccache invalidation; that is,
the failure mode doesn't depend on the input files, but once it's
borked you need a change to kick ccache. My memory may be playing
tricks on me but I vaguely recall seeing another animal do this, a
while back.

queensnake's seen repeated cycles of unexplainable build failures.
I wonder if it's using a bogus ccache version, or if the machine
itself is flaky.

regards, tom lane

#4 Andres Freund
andres@anarazel.de
In reply to: Filipe Rosset (#3)
Re: How did queensnake corrupt zic.o?

Hi,

On 2022-02-15 10:00:45 -0300, Filipe Rosset wrote:

I had to disable ccache on queensnake; all builds are fine right now. Let's
see how it goes.

Due to the OS migrations on queensnake (Scientific Linux -> CentOS -> Rocky Linux),
I think it's time to decommission queensnake and request a new animal
for a new server, maybe.

Did you keep the ccache cache across these migrations? If so, I'd not at all
be surprised if that caused the problem. I don't think ccache is going to be
reliably protecting against all the things changing between distribution
[versions]. Particularly not if the ccache version also changed.
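
One way to avoid carrying a cache across such a migration could look like the following Python sketch. It is an illustration under stated assumptions, not a ccache feature: the stamp file name `ENV_FINGERPRINT` and both helper functions are hypothetical, and the compiler version string is supplied by the caller. The idea is to record a fingerprint of the environment inside the cache directory and to wipe the directory whenever that fingerprint no longer matches.

```python
import hashlib
import platform
import shutil
from pathlib import Path

STAMP_NAME = "ENV_FINGERPRINT"  # hypothetical stamp file kept inside the cache


def environment_fingerprint(compiler_version: str) -> str:
    """Hash the OS description together with a caller-supplied compiler version."""
    raw = f"{platform.platform()}\n{compiler_version}"
    return hashlib.sha256(raw.encode()).hexdigest()


def reset_cache_if_environment_changed(cache_dir: Path, fingerprint: str) -> bool:
    """Wipe cache_dir unless its stamp matches fingerprint.

    Returns True if the directory was (re)initialized, False if it was kept.
    """
    stamp = cache_dir / STAMP_NAME
    if stamp.is_file() and stamp.read_text() == fingerprint:
        return False  # same environment: keep the existing cache
    if cache_dir.exists():
        shutil.rmtree(cache_dir)  # drop every entry from the old environment
    cache_dir.mkdir(parents=True)
    stamp.write_text(fingerprint)
    return True
```

A build script could call `reset_cache_if_environment_changed(cache_dir, environment_fingerprint(version))` before each run, where `version` is whatever string the compiler reports about itself. The fingerprint here is deliberately coarse; it only catches changes that show up in the OS description or the compiler version.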

Greetings,

Andres Freund