Unstable select_parallel regression output in 12rc1
Building the 12rc1 package on Ubuntu eoan/amd64, I got this
regression diff:
12:06:27 diff -U3 /<<PKGBUILDDIR>>/build/../src/test/regress/expected/select_parallel.out /<<PKGBUILDDIR>>/build/src/bin/pg_upgrade/tmp_check/regress/results/select_parallel.out
12:06:27 --- /<<PKGBUILDDIR>>/build/../src/test/regress/expected/select_parallel.out 2019-09-23 20:24:42.000000000 +0000
12:06:27 +++ /<<PKGBUILDDIR>>/build/src/bin/pg_upgrade/tmp_check/regress/results/select_parallel.out 2019-09-26 10:06:21.171683801 +0000
12:06:27 @@ -21,8 +21,8 @@
12:06:27 Workers Planned: 3
12:06:27 -> Partial Aggregate
12:06:27 -> Parallel Append
12:06:27 - -> Parallel Seq Scan on d_star
12:06:27 -> Parallel Seq Scan on f_star
12:06:27 + -> Parallel Seq Scan on d_star
12:06:27 -> Parallel Seq Scan on e_star
12:06:27 -> Parallel Seq Scan on b_star
12:06:27 -> Parallel Seq Scan on c_star
12:06:27 @@ -75,8 +75,8 @@
12:06:27 Workers Planned: 3
12:06:27 -> Partial Aggregate
12:06:27 -> Parallel Append
12:06:27 - -> Seq Scan on d_star
12:06:27 -> Seq Scan on f_star
12:06:27 + -> Seq Scan on d_star
12:06:27 -> Seq Scan on e_star
12:06:27 -> Seq Scan on b_star
12:06:27 -> Seq Scan on c_star
12:06:27 @@ -103,7 +103,7 @@
12:06:27 -----------------------------------------------------
12:06:27 Finalize Aggregate
12:06:27 -> Gather
12:06:27 - Workers Planned: 1
12:06:27 + Workers Planned: 3
12:06:27 -> Partial Aggregate
12:06:27 -> Append
12:06:27 -> Parallel Seq Scan on a_star
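(For context: these plans come from the Parallel Append tests over the
a_star inheritance hierarchy in select_parallel.sql. The query is
roughly of this form, paraphrased here rather than quoted verbatim
from the test file:

    explain (costs off)
      select round(avg(aa)), sum(aa) from a_star;

The diff is thus only about the ordering of the child scans and the
planned worker count, not about any query results.)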
Retriggering the build worked, though.
Christoph
Christoph Berg <myon@debian.org> writes:
> Building the 12rc1 package on Ubuntu eoan/amd64, I got this
> regression diff:
The append-order differences have been seen before, per this thread:
/messages/by-id/CA+hUKG+0CxrKRWRMf5ymN3gm+BECHna2B-q1w8onKBep4HasUw@mail.gmail.com
We haven't seen it in HEAD for quite some time, though I fear that's
just due to bad luck or a change in the timing of unrelated tests. I've
been hoping to catch it in HEAD to validate the theory I posited in
<22315.1563378828@sss.pgh.pa.us>, but your report doesn't help because
the additional checking queries aren't there in the v12 branch :-(
> 12:06:27 @@ -103,7 +103,7 @@
> 12:06:27 -----------------------------------------------------
> 12:06:27 Finalize Aggregate
> 12:06:27 -> Gather
> 12:06:27 - Workers Planned: 1
> 12:06:27 + Workers Planned: 3
> 12:06:27 -> Partial Aggregate
> 12:06:27 -> Append
> 12:06:27 -> Parallel Seq Scan on a_star
We've also seen this on a semi-regular basis, and I've been intending
to bitch about it, though it didn't seem very useful to do so as long
as there were other instabilities in the regression tests. What we
could do, perhaps, is feed the plan output through a filter that
suppresses the exact number-of-workers value. There's precedent
for such plan-filtering elsewhere in the tests already.
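For illustration, a minimal sketch of what such a filter could look
like, modeled on the EXPLAIN-filtering helpers already used in the
tests (e.g. explain_parallel_append() in partition_prune.sql); the
function name here is made up:

    create function filter_parallel_plan(query_text text) returns setof text
    language plpgsql as
    $$
    declare
        ln text;
    begin
        for ln in
            execute format('explain (costs off) %s', query_text)
        loop
            -- mask the planned worker count, which can vary with load
            ln := regexp_replace(ln, 'Workers Planned: \d+',
                                 'Workers Planned: N');
            return next ln;
        end loop;
    end;
    $$;

The test would then run something like
select filter_parallel_plan('select round(avg(aa)), sum(aa) from a_star');
instead of a bare EXPLAIN, so the expected output no longer encodes an
exact worker count.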
regards, tom lane
Re: Tom Lane 2019-09-26 <12685.1569510771@sss.pgh.pa.us>
> We haven't seen it in HEAD for quite some time, though I fear that's
> just due to bad luck or a change in the timing of unrelated tests.
The v13 package builds that are running every 6h here haven't seen a
problem yet either, so the probability of triggering it seems very
low. So it's not a pressing problem. (There are some extension modules
where the test suite fails at a much higher rate; getting all targets
to pass at the same time is next to impossible there :(. )
Christoph
Christoph Berg <myon@debian.org> writes:
> Re: Tom Lane 2019-09-26 <12685.1569510771@sss.pgh.pa.us>
>> We haven't seen it in HEAD for quite some time, though I fear that's
>> just due to bad luck or a change in the timing of unrelated tests.
> The v13 package builds that are running every 6h here haven't seen a
> problem yet either, so the probability of triggering it seems very
> low. So it's not a pressing problem.
I've pushed some changes to try to ameliorate the issue.
> (There are some extension modules
> where the test suite fails at a much higher rate; getting all targets
> to pass at the same time is next to impossible there :(. )
I feel your pain, believe me. I used to fight the same kind of problems
when I was at Red Hat. Are any of those extension modules part of
Postgres?
regards, tom lane
Re: Tom Lane 2019-09-28 <24917.1569692191@sss.pgh.pa.us>
>> (There are some extension modules
>> where the test suite fails at a much higher rate; getting all targets
>> to pass at the same time is next to impossible there :(. )
> I feel your pain, believe me. I used to fight the same kind of problems
> when I was at Red Hat. Are any of those extension modules part of
> Postgres?
No, external ones. The main offenders at the moment are pglogical and
patroni (admittedly not an extension in the strict sense). Both have
extensive test suites that exercise replication scenarios that are
prone to race conditions. (Maybe we should just run fewer tests for the
packaging.)
Christoph