cfbot failures
On 2023-02-11, Andres Freund wrote 20230212004254.3lp22a7bpkcjo3y6@awork3.anarazel.de:
> The windows test failure is a transient issue independent of the patch
> (something went wrong with image permissions).
That's happening again since 3h ago.
https://cirrus-ci.com/github/postgresql-cfbot/postgresql
I suggested in the past that cfbot should delay if (say) the last 5 or
10 consecutive runs all failed (or maybe all failed on the same "task").
Maybe that should only apply to re-tests but not to new patches. It
could inject 15min delays until the condition is resolved. Or it could
run retests on a longer interval like 96h instead of 24h. And add a
warning or start beeping about the issue.
That would mitigate not only issues in the master branch but also issues
with CI infrastructure (cirrus/google/images).
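
To make the idea concrete, here is a rough sketch of the check, not cfbot's actual code. It assumes each build is available as a mapping from task name to pass/fail, ordered oldest to newest; the names `broken_tasks` and `retest_interval` and the `Build` type are hypothetical, and the 5-build threshold and 24h/96h intervals are just the numbers suggested above.

```python
from datetime import timedelta
from typing import Dict, List, Set

# Hypothetical representation: task name -> True if that task passed.
Build = Dict[str, bool]


def broken_tasks(history: List[Build], threshold: int = 5) -> Set[str]:
    """Return the tasks that failed in every one of the last `threshold` builds.

    `history` is ordered oldest to newest. An empty result means there is no
    consistent failure, or not enough history to decide.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if len(history) < threshold:
        return set()
    recent = history[-threshold:]
    # Start with the failures of the newest build, then keep only the tasks
    # that also failed in each of the earlier builds in the window.
    candidates = {task for task, ok in recent[-1].items() if not ok}
    for build in recent[:-1]:
        candidates &= {task for task, ok in build.items() if not ok}
    return candidates


def retest_interval(
    history: List[Build],
    is_retest: bool,
    threshold: int = 5,
    normal: timedelta = timedelta(hours=24),
    degraded: timedelta = timedelta(hours=96),
) -> timedelta:
    """Pick the retest interval; new patches are never slowed down."""
    if is_retest and broken_tasks(history, threshold):
        return degraded
    return normal
```

With five consecutive builds in which only the Windows task fails, `broken_tasks` reports that task and retests fall back to the 96h interval, while a newly submitted patch still runs on the normal schedule.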
--
Justin
Hi,
On 2023-02-19 19:08:41 -0600, Justin Pryzby wrote:
> On 2023-02-11, Andres Freund wrote 20230212004254.3lp22a7bpkcjo3y6@awork3.anarazel.de:
> > The windows test failure is a transient issue independent of the patch
> > (something went wrong with image permissions).
>
> That's happening again since 3h ago.
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql
Fixed manually. This is some sort of GCP issue. Bilal tried to deploy a
workaround, but that hasn't worked yet.
[21:39:06.006] 2023-02-19T21:39:06Z: ==> windows.googlecompute.windows-ci-vs-2019: Creating image...
[21:44:08.025] 2023-02-19T21:44:08Z: ==> windows.googlecompute.windows-ci-vs-2019: Error waiting for image: time out while waiting for image to register
...
[21:44:10.990] gcloud compute images deprecate pg-ci-${CIRRUS_TASK_NAME}-${DATE} --state=DEPRECATED
[21:44:33.834] ERROR: (gcloud.compute.images.deprecate) Could not fetch resource:
[21:44:33.834] - Required 'compute.images.deprecate' permission for 'projects/cirrus-ci-community/global/images/pg-ci-windows-ci-vs-2019-2023-02-19t21-30-43'
Greetings,
Andres Freund