Re: Parallel append plan instability/randomness - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Parallel append plan instability/randomness |
Date | |
Msg-id | CAA4eK1LegxN9of0sydNM5n4up=7oCj=-E94edKDOsYFKyb1R0Q@mail.gmail.com Whole thread Raw |
In response to | Parallel append plan instability/randomness (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Parallel append plan instability/randomness
Re: Parallel append plan instability/randomness |
List | pgsql-hackers |
On Sun, Jan 7, 2018 at 5:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > According to buildfarm member silverfish, > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=silverfish&dt=2018-01-06%2008%3A53%3A38 > > it's possible to sometimes get this failure in the regression tests: > > *** /mnt/buildfarm/buildroot/HEAD/pgsql.build/../pgsql/src/test/regress/expected/select_parallel.out Tue Dec 19 20:24:022017 > --- /mnt/buildfarm/buildroot/HEAD/pgsql.build/src/test/regress/results/select_parallel.out Sat Jan 6 09:21:39 2018 > *************** > *** 75,84 **** > Workers Planned: 3 > -> Partial Aggregate > -> Parallel Append > -> Seq Scan on d_star > -> Seq Scan on f_star > -> Seq Scan on e_star > - -> Seq Scan on b_star > -> Seq Scan on c_star > -> Seq Scan on a_star > (11 rows) > --- 75,84 ---- > Workers Planned: 3 > -> Partial Aggregate > -> Parallel Append > + -> Seq Scan on b_star > -> Seq Scan on d_star > -> Seq Scan on f_star > -> Seq Scan on e_star > -> Seq Scan on c_star > -> Seq Scan on a_star > (11 rows) > > Irreproducible failures in the regression tests are not very acceptable. > Furthermore, considering that the query being tested is > > explain (costs off) > select round(avg(aa)), sum(aa) from a_star; > > it seems to me that the "expected" order of the sub-scans is mighty > random to begin with. > I think order of sub-scans can be random if the number of rows in child relations can vary across runs. For the above case, the subpaths (non-partial-paths) are always in the descending order of their cost and I can see that by running it locally. On my local m/c, output is as below: regression=# explain select round(avg(aa)), sum(aa) from a_star; QUERY PLAN ------------------------------------------------------------------------------- Finalize Aggregate (cost=2.30..2.31 rows=1 width=40) -> Gather (cost=2.28..2.29 rows=3 width=40) Workers Planned: 3 -> Partial Aggregate (cost=2.28..2.29 rows=1 width=40) -> Parallel Append (cost=0.00..2.20 rows=15 width=4) -> Seq Scan on d_star (cost=0.00..1.16 rows=16 width=4) -> Seq Scan on f_star (cost=0.00..1.16 rows=16 width=4) -> Seq Scan on e_star (cost=0.00..1.07 rows=7 width=4) -> Seq Scan on b_star (cost=0.00..1.04 rows=4 width=4) -> Seq Scan on c_star (cost=0.00..1.04 rows=4 width=4) -> Seq Scan on a_star (cost=0.00..1.03 rows=3 width=4) (11 rows) The above indicates that paths are listed in the order as expected. What makes you think that the order of sub-scans can be random? Is it possible that the number of rows in child relations can vary across runs? One theory that can explain above failure is that the costs of scanning some of the sub-paths is very close due to which sometimes the results can vary. If that is the case, then probably using fuzz_factor in costs comparison (as is done in attached patch) can improve the situation, may be we have to consider some other factors like number of rows in each subpath. However, it might be better to first somehow reproduce this case and see what is going wrong, any ideas? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
pgsql-hackers by date: