Thread: Unstable select_parallel regression output in 12rc1
Building the 12rc1 package on Ubuntu eoan/amd64, I got this
regression diff:

12:06:27 diff -U3 /<<PKGBUILDDIR>>/build/../src/test/regress/expected/select_parallel.out /<<PKGBUILDDIR>>/build/src/bin/pg_upgrade/tmp_check/regress/results/select_parallel.out
12:06:27 --- /<<PKGBUILDDIR>>/build/../src/test/regress/expected/select_parallel.out	2019-09-23 20:24:42.000000000 +0000
12:06:27 +++ /<<PKGBUILDDIR>>/build/src/bin/pg_upgrade/tmp_check/regress/results/select_parallel.out	2019-09-26 10:06:21.171683801 +0000
12:06:27 @@ -21,8 +21,8 @@
12:06:27           Workers Planned: 3
12:06:27           ->  Partial Aggregate
12:06:27                 ->  Parallel Append
12:06:27 -                    ->  Parallel Seq Scan on d_star
12:06:27                       ->  Parallel Seq Scan on f_star
12:06:27 +                    ->  Parallel Seq Scan on d_star
12:06:27                       ->  Parallel Seq Scan on e_star
12:06:27                       ->  Parallel Seq Scan on b_star
12:06:27                       ->  Parallel Seq Scan on c_star
12:06:27 @@ -75,8 +75,8 @@
12:06:27           Workers Planned: 3
12:06:27           ->  Partial Aggregate
12:06:27                 ->  Parallel Append
12:06:27 -                    ->  Seq Scan on d_star
12:06:27                       ->  Seq Scan on f_star
12:06:27 +                    ->  Seq Scan on d_star
12:06:27                       ->  Seq Scan on e_star
12:06:27                       ->  Seq Scan on b_star
12:06:27                       ->  Seq Scan on c_star
12:06:27 @@ -103,7 +103,7 @@
12:06:27  -----------------------------------------------------
12:06:27  Finalize Aggregate
12:06:27    ->  Gather
12:06:27 -        Workers Planned: 1
12:06:27 +        Workers Planned: 3
12:06:27          ->  Partial Aggregate
12:06:27                ->  Append
12:06:27                      ->  Parallel Seq Scan on a_star

Retriggering the build worked, though.
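(For reference, the plans above come from queries of roughly this
shape in select_parallel.sql; quoting from memory:

explain (costs off)
  select round(avg(aa)), sum(aa) from a_star;

The order of the d_star/f_star scans under the Parallel Append is
what flips between runs.)

Christoph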
Christoph Berg <myon@debian.org> writes:
> Building the 12rc1 package on Ubuntu eoan/amd64, I got this
> regression diff:

The append-order differences have been seen before, per this thread:

https://www.postgresql.org/message-id/flat/CA%2BhUKG%2B0CxrKRWRMf5ymN3gm%2BBECHna2B-q1w8onKBep4HasUw%40mail.gmail.com

We haven't seen it in quite some time in HEAD, though I fear that's
just due to bad luck or changed timing of unrelated tests.  I've been
hoping to catch it in HEAD to validate the theory I posited in
<22315.1563378828@sss.pgh.pa.us>, but your report doesn't help with
that, because the additional checking queries aren't there in the v12
branch :-(

> 12:06:27 @@ -103,7 +103,7 @@
> 12:06:27  -----------------------------------------------------
> 12:06:27  Finalize Aggregate
> 12:06:27    ->  Gather
> 12:06:27 -        Workers Planned: 1
> 12:06:27 +        Workers Planned: 3
> 12:06:27          ->  Partial Aggregate
> 12:06:27                ->  Append
> 12:06:27                      ->  Parallel Seq Scan on a_star

We've also seen this on a semi-regular basis, and I've been intending
to bitch about it, though it didn't seem very useful to do so as long
as there were other instabilities in the regression tests.  What we
could do, perhaps, is feed the plan output through a filter that
suppresses the exact number-of-workers value.  There's precedent for
such plan-filtering elsewhere in the tests already.
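Something like this, say (untested sketch; the function name is made
up, but the regexp_replace trick is much the same thing the existing
plan filters do):

create function explain_filter_workers(qry text) returns setof text
language plpgsql as
$$
declare
    ln text;
begin
    -- Run EXPLAIN and post-process each line of its output.
    for ln in execute format('explain (costs off) %s', qry)
    loop
        -- Mask the load-dependent worker count.
        ln := regexp_replace(ln, 'Workers Planned: \d+',
                             'Workers Planned: N');
        return next ln;
    end loop;
end;
$$;

The affected test cases would then do

select explain_filter_workers('select round(avg(aa)), sum(aa) from a_star');

instead of a bare EXPLAIN.

			regards, tom lane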
Re: Tom Lane 2019-09-26 <12685.1569510771@sss.pgh.pa.us>
> We haven't seen it in quite some time in HEAD, though I fear that's
> just due to bad luck or changed timing of unrelated tests.

The v13 package builds that are running every 6h here haven't seen
the problem yet either, so the probability of triggering it seems
very low.  So it's not a pressing problem.

(There are some extension modules where the testsuite fails at a much
higher rate; getting all targets to pass at the same time is next to
impossible there :(. )

Christoph
Christoph Berg <myon@debian.org> writes:
> Re: Tom Lane 2019-09-26 <12685.1569510771@sss.pgh.pa.us>
>> We haven't seen it in quite some time in HEAD, though I fear that's
>> just due to bad luck or changed timing of unrelated tests.

> The v13 package builds that are running every 6h here haven't seen
> the problem yet either, so the probability of triggering it seems
> very low.  So it's not a pressing problem.

I've pushed some changes to try to ameliorate the issue.

> (There are some extension modules where the testsuite fails at a
> much higher rate; getting all targets to pass at the same time is
> next to impossible there :(. )

I feel your pain, believe me.  I used to fight the same kind of
problems when I was at Red Hat.  Are any of those extension modules
part of Postgres?

			regards, tom lane
Re: Tom Lane 2019-09-28 <24917.1569692191@sss.pgh.pa.us>
> > (There are some extension modules where the testsuite fails at a
> > much higher rate; getting all targets to pass at the same time is
> > next to impossible there :(. )
>
> I feel your pain, believe me.  I used to fight the same kind of
> problems when I was at Red Hat.  Are any of those extension modules
> part of Postgres?

No, external ones.  The main offenders at the moment are pglogical
and patroni (admittedly not an extension in the strict sense).  Both
have extensive testsuites that exercise replication scenarios, which
are prone to race conditions.  (Maybe we should just run fewer tests
for the packaging.)

Christoph