Adding null patch entry to cfbot/CommitFest

Started by Tatsuo Ishii8 months ago5 messages
#1Tatsuo Ishii
ishii@postgresql.org

I have observed cases where a cfbot entry fails without clear reason
[1]: /messages/by-id/20250422.111139.1502127917165838403.ishii@postgresql.org
cfbot test. In this case we could easily guess that master branch
might have problems at the time when the tests performed. But if a
patch modifies some codes, it's not that easy because it is possible
that the test reveals hidden defects of the patch or maybe not. To
make reviewer's job easier, I would like to propose cfbot/CommitFest
have "null patch entry", which is a dummy entry without any patch,
just for testing healthiness of master branch.

One might think adding new buildfarm animals could do the same. But I
think adding to cfbot/CommitFest is better because:

- Test timing. Cfbot possibly runs the null patch entry test everyday
so that one can easily compare a particular patch's test results with
the null patch test results easily.

- Test method. By letting cfbot run the test, one can easily compare
test results because the test method is identical.

[1]: /messages/by-id/20250422.111139.1502127917165838403.ishii@postgresql.org

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Tatsuo Ishii (#1)
Re: Adding null patch entry to cfbot/CommitFest

On Monday, May 19, 2025, Tatsuo Ishii <ishii@postgresql.org> wrote:

I have observed cases where a cfbot entry fails without clear reason
[1]. Even if the patch just modifies comments, it has failed in some
cfbot test. In this case we could easily guess that master branch
might have problems at the time when the tests performed.

I too made this realization while reviewing the application. I concur it
is something that we should try and mitigate. Sending a canary patch
through once-a-day, or on any fixed time period, doesn’t quite seem
sufficient. We have many commits per day and immediately switch to them as
they are seen.

The system itself needs to be able to simply create a job and test
master/HEAD whenever it wishes. Then use the outcome of the job to decide
whether to wait for a new HEAD before pulling more CF entries. Or,
alternatively, continue using the “previous” commit as the base until
master/HEAD changes again and it can evaluate the new proposed base.

There are some other variations on these to consider as well. Like, the
first patch (or multiple due to concurrency) to fail on a new commit base
waits for a test of the base to complete before being marked failed or
aborted.

That said, adding a “null” CF entry to the queue right now is doable and
virtually cost-free. While it may not provide complete benefit there
likely would be some. Anyone could increment the version number and email
the thread to release a new canary on-demand. It would also provide some
data/feedback while designing and implementing a more integrated solution.

David J.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#2)
Re: Adding null patch entry to cfbot/CommitFest

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Monday, May 19, 2025, Tatsuo Ishii <ishii@postgresql.org> wrote:

I have observed cases where a cfbot entry fails without clear reason
[1]. Even if the patch just modifies comments, it has failed in some
cfbot test. In this case we could easily guess that master branch
might have problems at the time when the tests performed.

I too made this realization while reviewing the application. I concur it
is something that we should try and mitigate. Sending a canary patch
through once-a-day, or on any fixed time period, doesn’t quite seem
sufficient.

Yeah, I'm afraid that won't do much except eat valuable cycles.

IME the real problem with the CI tests is that the CI infrastructure
itself is not very stable, and intermittently fails tests for reasons
having little to do with the tested patch *or* the state of the master
branch. If there is a problem in master, we typically know it because
the buildfarm is also unhappy, so I'm not clear on what a null patch
in the queue would add.

regards, tom lane

#4Tatsuo Ishii
ishii@postgresql.org
In reply to: Tom Lane (#3)
Re: Adding null patch entry to cfbot/CommitFest

I too made this realization while reviewing the application. I concur it
is something that we should try and mitigate. Sending a canary patch
through once-a-day, or on any fixed time period, doesn’t quite seem
sufficient.

Yeah, I'm afraid that won't do much except eat valuable cycles.

IME the real problem with the CI tests is that the CI infrastructure
itself is not very stable, and intermittently fails tests for reasons
having little to do with the tested patch *or* the state of the master
branch.

I am not sure. Cfbot seems to run tests for a particular patch every
day until it fails. (Once tests failed, it seems Cfbot sleep 10 days
or so if a patch is unchanged). If CI is that unstable, we should see
random failure day by day even if the patch is unchanged.

If there is a problem in master, we typically know it because
the buildfarm is also unhappy, so I'm not clear on what a null patch
in the queue would add.

The particular CF test I am seeing failure is "Windows - Server 2019,
VS 2019 - Meson & ninja". As far as I know there's no equivalent
animal in the buildfarm.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

#5Tatsuo Ishii
ishii@postgresql.org
In reply to: David G. Johnston (#2)
Re: Adding null patch entry to cfbot/CommitFest

I too made this realization while reviewing the application. I concur it
is something that we should try and mitigate. Sending a canary patch
through once-a-day, or on any fixed time period, doesn’t quite seem
sufficient. We have many commits per day and immediately switch to them as
they are seen.

The system itself needs to be able to simply create a job and test
master/HEAD whenever it wishes. Then use the outcome of the job to decide
whether to wait for a new HEAD before pulling more CF entries. Or,
alternatively, continue using the “previous” commit as the base until
master/HEAD changes again and it can evaluate the new proposed base.

There are some other variations on these to consider as well. Like, the
first patch (or multiple due to concurrency) to fail on a new commit base
waits for a test of the base to complete before being marked failed or
aborted.

That said, adding a “null” CF entry to the queue right now is doable and
virtually cost-free.

Unfortunately cfbot seems to sleep 10 days or so once a test fails
(and if patch is not get updated). So it's not automatically doable
today. I need manual updating of the patch.

While it may not provide complete benefit there
likely would be some.

Yeah. But I still think even once a day test is useful because it will
reduce the number of commits to git bisect.

Anyone could increment the version number and email
the thread to release a new canary on-demand. It would also provide some
data/feedback while designing and implementing a more integrated solution.

Okay, next time I face a cfbot error like I explained upthread, I will
create a "canary".

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp