Home > mailing lists

Re: Parallel append plan instability/randomness - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel append plan instability/randomness
Date	January 8, 2018 07:40:38
Msg-id	CAA4eK1LegxN9of0sydNM5n4up=7oCj=-E94edKDOsYFKyb1R0Q@mail.gmail.com Whole thread
In response to	Parallel append plan instability/randomness (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Parallel append plan instability/randomness Re: Parallel append plan instability/randomness
List	pgsql-hackers

Tree view

On Sun, Jan 7, 2018 at 5:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> According to buildfarm member silverfish,
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=silverfish&dt=2018-01-06%2008%3A53%3A38
>
> it's possible to sometimes get this failure in the regression tests:
>
> *** /mnt/buildfarm/buildroot/HEAD/pgsql.build/../pgsql/src/test/regress/expected/select_parallel.out    Tue Dec 19
20:24:022017
 
> --- /mnt/buildfarm/buildroot/HEAD/pgsql.build/src/test/regress/results/select_parallel.out      Sat Jan  6 09:21:39
2018
> ***************
> *** 75,84 ****
>            Workers Planned: 3
>            ->  Partial Aggregate
>                  ->  Parallel Append
>                        ->  Seq Scan on d_star
>                        ->  Seq Scan on f_star
>                        ->  Seq Scan on e_star
> -                      ->  Seq Scan on b_star
>                        ->  Seq Scan on c_star
>                        ->  Seq Scan on a_star
>   (11 rows)
> --- 75,84 ----
>            Workers Planned: 3
>            ->  Partial Aggregate
>                  ->  Parallel Append
> +                      ->  Seq Scan on b_star
>                        ->  Seq Scan on d_star
>                        ->  Seq Scan on f_star
>                        ->  Seq Scan on e_star
>                        ->  Seq Scan on c_star
>                        ->  Seq Scan on a_star
>   (11 rows)
>
> Irreproducible failures in the regression tests are not very acceptable.
> Furthermore, considering that the query being tested is
>
> explain (costs off)
>   select round(avg(aa)), sum(aa) from a_star;
>
> it seems to me that the "expected" order of the sub-scans is mighty
> random to begin with.
>

I think order of sub-scans can be random if the number of rows in
child relations can vary across runs.  For the above case, the
subpaths (non-partial-paths) are always in the descending order of
their cost and I can see that by running it locally.  On my local m/c,
output is as below:

regression=# explain select round(avg(aa)), sum(aa) from a_star;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Finalize Aggregate  (cost=2.30..2.31 rows=1 width=40)
   ->  Gather  (cost=2.28..2.29 rows=3 width=40)
         Workers Planned: 3
         ->  Partial Aggregate  (cost=2.28..2.29 rows=1 width=40)
               ->  Parallel Append  (cost=0.00..2.20 rows=15 width=4)
                     ->  Seq Scan on d_star  (cost=0.00..1.16 rows=16 width=4)
                     ->  Seq Scan on f_star  (cost=0.00..1.16 rows=16 width=4)
                     ->  Seq Scan on e_star  (cost=0.00..1.07 rows=7 width=4)
                     ->  Seq Scan on b_star  (cost=0.00..1.04 rows=4 width=4)
                     ->  Seq Scan on c_star  (cost=0.00..1.04 rows=4 width=4)
                     ->  Seq Scan on a_star  (cost=0.00..1.03 rows=3 width=4)
(11 rows)

The above indicates that paths are listed in the order as expected.
What makes you think that the order of sub-scans can be random?  Is it
possible that the number of rows in child relations can vary across
runs?

One theory that can explain above failure is that the costs of
scanning some of the sub-paths is very close due to which sometimes
the results can vary.  If that is the case, then probably using
fuzz_factor in costs comparison (as is done in attached patch) can
improve the situation, may be we have to consider some other factors
like number of rows in each subpath.  However, it might be better to
first somehow reproduce this case and see what is going wrong, any
ideas?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

fix_pa_cost_comp_v1.patch

pgsql-hackers by date:

From: Thomas Munro
Date: 08 January 2018, 07:32:27
Subject: Re: heads up: Fix for intel hardware bug will lead to performance regressions

From: Michael Paquier
Date: 08 January 2018, 08:24:54
Subject: Re: heads up: Fix for intel hardware bug will lead to performance regressions

Re: Parallel append plan instability/randomness - Mailing list pgsql-hackers

Attachment

Previous

Next