Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Started by Nazir Bilal Yavuz, 3 months ago | 6 messages | hackers
#1 Nazir Bilal Yavuz
byavuz81@gmail.com

Hi,

Currently, all test logs are uploaded to Postgres CI regardless of
whether the tests pass or fail. This approach has a few drawbacks:

- It can be difficult to identify failed tests quickly. You need to
remember which tests failed and then locate them within the CI
artifacts, which often requires scrolling and can be frustrating.

- Uploading all test logs adds unnecessary overhead. For example,
uploading artifacts when only a single test fails takes approximately
300 seconds on macOS and 70 seconds on other CI platforms [1].

- There may also be associated storage or transfer costs, although I
am not certain about this.

To improve this, I propose removing the output folders of successful
tests before uploading artifacts. In Meson builds, a 'test.success'
file is created in the test output directory when a test passes. I
have written a Python script that traverses these directories and
removes those directories which contain this file. At the moment, this
solution only applies to Meson builds, since the test.success file is
not generated in the Autoconf build system.
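
A minimal sketch of the idea described above, not the attached patch itself. The function name, the default marker argument, and the collect-then-delete strategy are illustrative assumptions; only the 'test.success' file name comes from this mail.

```python
import shutil
from pathlib import Path


def remove_successful_test_dirs(testrun_root: Path, marker: str = "test.success") -> list[Path]:
    """Delete every directory below testrun_root that directly contains marker.

    Hypothetical helper: returns the removed directories so the caller can log them.
    """
    # Collect all matches first, so the tree is not modified while it is walked.
    # The root itself is never removed, even if it happens to hold a marker.
    doomed = sorted(
        p.parent
        for p in testrun_root.rglob(marker)
        if p.is_file() and p.parent != testrun_root
    )
    removed = []
    for directory in doomed:
        # A directory may already be gone if one of its parents was removed.
        if directory.exists():
            shutil.rmtree(directory)
            removed.append(directory)
    return removed
```

In CI this would be called once on the test output directory (for example 'build/testrun', which is an assumed location here) right before the artifact upload, so that only directories without the marker remain.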

I would appreciate any thoughts or feedback on this approach.

Note: Currently NetBSD is failing with: 'env: python3: No such file or
directory', this can be fixed separately but I wanted to hear your
thoughts first.

Example CI run with the patch applied; CI was intentionally broken to
show how the patch works:
https://cirrus-ci.com/build/6514731441192960

[1]: https://cirrus-ci.com/build/6045972905590784

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments:

v1-0001-ci-Don-t-collect-successful-tests-logs.patch (text/x-patch; charset=US-ASCII; +67 -1)

#2 Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Nazir Bilal Yavuz (#1)
Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Hi,

On Tue, 7 Apr 2026 at 19:18, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> Note: Currently NetBSD is failing with: 'env: python3: No such file or
> directory', this can be fixed separately but I wanted to hear your
> thoughts first.

Quick update: Windows CI tasks seem to be running into issues. Like
NetBSD, they can be fixed separately.

--
Regards,
Nazir Bilal Yavuz
Microsoft

#3 Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Nazir Bilal Yavuz (#2)
Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Hi,

On Tue, 7 Apr 2026 at 19:27, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> Hi,
>
> On Tue, 7 Apr 2026 at 19:18, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
>> Note: Currently NetBSD is failing with: 'env: python3: No such file or
>> directory', this can be fixed separately but I wanted to hear your
>> thoughts first.
>
> Quick update: Windows CI tasks seem to be running into issues. Like
> NetBSD, they can be fixed separately.

Here is v2, which addresses the issues I mentioned earlier.

I have updated the design since this will only be used for Meson
builds. I decided to invoke the script using 'meson compile
clear_testrun_folder', as this avoids the need to manually call
python3. This approach prevents issues on NetBSD, where the python3
environment may not be found.

I don’t expect the 'ci_meson_clear_testrun_folder' script to be used
outside of CI, so I placed it in the 'src/tools/ci/' directory.
However, if you think it could be useful beyond CI, we could consider
renaming it and moving it to 'src/tools/'.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments:

v2-0001-ci-Don-t-collect-successful-tests-logs.patch (text/x-patch; charset=US-ASCII; +86 -1)

#4 Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Nazir Bilal Yavuz (#3)
Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Hi,

On Thu, 9 Apr 2026 at 11:01, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:

> On Tue, 7 Apr 2026 at 19:27, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Here is v2, which addresses the issues I mentioned earlier.
>
> I have updated the design since this will only be used for Meson
> builds. I decided to invoke the script using 'meson compile
> clear_testrun_folder', as this avoids the need to manually call
> python3. This approach prevents issues on NetBSD, where the python3
> environment may not be found.
>
> I don’t expect the 'ci_meson_clear_testrun_folder' script to be used
> outside of CI, so I placed it in the 'src/tools/ci/' directory.
> However, if you think it could be useful beyond CI, we could consider
> renaming it and moving it to 'src/tools/'.

Here is v3, modified for GitHub Actions.

I implemented it as another step which will run if the job fails.
Alternatively, we can put 'the clearing command' into the
meson_test_world step, this makes the change simpler but then this
command will run although tests don't fail.

With this change, the compressed size of the log decreases from ~8MB
to ~1MB. Also, it is much easier to find which tests failed.

Here is an example GHA run that was made to fail intentionally:
https://github.com/nbyavuz/postgres/actions/runs/27013199064

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments:

v3-0001-ci-Don-t-collect-successful-tests-logs.patch (text/x-patch; charset=US-ASCII; +72 -1)

#5 Andres Freund
andres@anarazel.de
In reply to: Nazir Bilal Yavuz (#4)
Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Hi,

On 2026-06-05 15:23:52 +0300, Nazir Bilal Yavuz wrote:

> I implemented it as another step which will run if the job fails.
> Alternatively, we can put 'the clearing command' into the
> meson_test_world step, this makes the change simpler but then this
> command will run although tests don't fail.

I think it's the right thing to have it as a dedicated step. That way it can
work with things like the running tests etc as well.

> + # Clear test folder so only failed tests are left

FWIW, I'd much rather see this copy the to-be-archived logs to a different
location, instead of removing the unneeded logs.

> +      - &clear_test_folder_step
> +        name: Clear test folder
> +        if: failure() && !cancelled()
> +        run: |
> +          meson compile clear_testrun_folder -C build

I don't think I see what we gain from invoking this via meson compile, given
that we invoke ninja directly elsewhere. Not that that's going to make the
difference, but meson compile, which internally invokes ninja, is noticeably
slower than going through ninja directly.

But I suspect this shouldn't be integrated into the build system directly, as
I think we should eventually make this work for autoconf as well.

Greetings,

Andres Freund

#6 Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Andres Freund (#5)
Re: Upload only the failed tests logs to the Postgres CI (Cirrus CI)

Hi,

Thank you for looking into this!

On Mon, 8 Jun 2026 at 18:10, Andres Freund <andres@anarazel.de> wrote:

> On 2026-06-05 15:23:52 +0300, Nazir Bilal Yavuz wrote:
>
>> I implemented it as another step which will run if the job fails.
>> Alternatively, we can put 'the clearing command' into the
>> meson_test_world step, this makes the change simpler but then this
>> command will run although tests don't fail.
>
> I think it's the right thing to have it as a dedicated step. That way it can
> work with things like the running tests etc as well.
>
>> + # Clear test folder so only failed tests are left
>
> FWIW, I'd much rather see this copy the to-be-archived logs to a different
> location, instead of removing the unneeded logs.

Should we upload these logs too? If so, perhaps we can have something like:

build/testrun/successful_tests/*
build/testrun/failed_tests/*

Do you think this makes sense?
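
To make the proposed layout concrete, here is a rough sketch that copies each test's output directory into one of the two folders instead of deleting anything. Everything in it is an assumption for illustration: the function name, the '<suite>/<test>' directory depth under the test output root, and the rule that a directory without 'test.success' counts as failed (which would also catch tests that never finished).

```python
import shutil
from pathlib import Path


def sort_test_logs(testrun_root: Path, marker: str = "test.success") -> dict[str, list[Path]]:
    """Copy each test directory into failed_tests/ or successful_tests/.

    Hypothetical helper; the originals are left untouched.
    """
    out_dirs = {
        "failed": testrun_root / "failed_tests",
        "successful": testrun_root / "successful_tests",
    }
    # Assumed layout: testrun_root/<suite>/<test>/.
    # Snapshot the list before copying and skip our own output folders,
    # so a second run does not re-sort already sorted copies.
    test_dirs = sorted(
        d
        for d in testrun_root.glob("*/*")
        if d.is_dir() and d.parent not in out_dirs.values()
    )
    result = {"failed": [], "successful": []}
    for test_dir in test_dirs:
        kind = "successful" if (test_dir / marker).is_file() else "failed"
        dest = out_dirs[kind] / test_dir.relative_to(testrun_root)
        shutil.copytree(test_dir, dest, dirs_exist_ok=True)
        result[kind].append(dest)
    return result
```

With something like this, CI could upload only 'failed_tests/' by default and still keep 'successful_tests/' available if we decide those logs are worth uploading too.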

>> +      - &clear_test_folder_step
>> +        name: Clear test folder
>> +        if: failure() && !cancelled()
>> +        run: |
>> +          meson compile clear_testrun_folder -C build
>
> I don't think I see what we gain from invoking this via meson compile, given
> that we invoke ninja directly elsewhere. Not that that's going to make the
> difference, but meson compile, which internally invokes ninja, is noticeably
> slower than going through ninja directly.

Got it, I will change this.

> But I suspect this shouldn't be integrated into the build system directly, as
> I think we should eventually make this work for autoconf as well.

There were a couple of reasons for integrating it into the build system:

1- It was easy to implement for the meson build because we have a
'test.success' file for successful tests; AFAIK we don't have
anything similar for autoconf.

2- python3 was not available on the PATH on some OSes, so I used
python3 indirectly, through meson.

--
Regards,
Nazir Bilal Yavuz
Microsoft