Re: [HACKERS] Possible regression with gather merge. - Mailing list pgsql-hackers

From Mithun Cy
Subject Re: [HACKERS] Possible regression with gather merge.
Date
Msg-id CAD__OugFR12QqSz-NpUEpAGKf2renavW_D80DHgg1row=RXcmQ@mail.gmail.com
Whole thread Raw
In response to [HACKERS] Possible regression with gather merge.  (Mithun Cy <mithun.cy@enterprisedb.com>)
Responses Re: [HACKERS] Possible regression with gather merge.  (Rushabh Lathia <rushabh.lathia@gmail.com>)
List pgsql-hackers
Adding more rows to table make gather merge execution time very slow
when compared to non-parallel plan we get after disabling gather
merge.

create table test as (select id, (random()*10000)::int as v1, random() as
v2 from generate_series(1,100000000) id);

postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=1858610.53..1858611.70 rows=10 width=16) (actual
 
time=31103.880..31103.885 rows=10 loops=1)  ->  Gather Merge  (cost=1858610.53..11581520.05 rows=83333406
width=16) (actual time=31103.878..31103.882 rows=10 loops=1)        Workers Planned: 2        Workers Launched: 2
->  Sort  (cost=1857610.50..1961777.26 rows=41666703
 
width=16) (actual time=30560.865..30561.046 rows=911 loops=3)              Sort Key: v1, v2              Sort Method:
externalmerge  Disk: 841584kB              ->  Parallel Seq Scan on test  (cost=0.00..957208.03
 
rows=41666703 width=16) (actual time=0.050..2330.275 rows=33333333
loops=3)Planning time: 0.292 msExecution time: 31502.896 ms
(10 rows)

postgres=# set max_parallel_workers_per_gather = 0;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10;
          QUERY
 
PLAN

------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=3701507.83..3701507.85 rows=10 width=16) (actual
 
time=13231.264..13231.266 rows=10 loops=1)  ->  Sort  (cost=3701507.83..3951508.05 rows=100000088 width=16)
(actual time=13231.261..13231.262 rows=10 loops=1)        Sort Key: v1, v2        Sort Method: top-N heapsort  Memory:
25kB       ->  Seq Scan on test  (cost=0.00..1540541.88 rows=100000088
 
width=16) (actual time=0.045..6759.363 rows=100000000 loops=1)Planning time: 0.131 msExecution time: 13231.299 ms
(7 rows)

On Wed, Mar 22, 2017 at 11:07 AM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
> I accidently encountered a case where gather merge was picked as
> default but disabling same by setting max_parallel_workers_per_gather
> = 0; produced a non-parallel plan which was faster than gather merge,
> but its cost is marked too high when compared to gather merge.
>
> I guess we need some cost adjustment is planner code.
>
> Test setting
> =========
> create table test as (select id, (random()*10000)::int as v1, random() as
> v2 from generate_series(1,1000000) id);
> create index test_v1_idx on test (v1);
>
>
> Server setting is default.
>
>
> postgres=# explain analyze select * from test order by v1, v2 limit 10;
>                                                                QUERY
> PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=19576.71..19577.88 rows=10 width=16) (actual
> time=265.989..265.995 rows=10 loops=1)
>    ->  Gather Merge  (cost=19576.71..116805.80 rows=833334 width=16)
> (actual time=265.987..265.992 rows=10 loops=1)
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  Sort  (cost=18576.69..19618.36 rows=416667 width=16)
> (actual time=250.202..250.424 rows=911 loops=3)
>                Sort Key: v1, v2
>                Sort Method: external merge  Disk: 9272kB
>                ->  Parallel Seq Scan on test  (cost=0.00..9572.67
> rows=416667 width=16) (actual time=0.053..41.397 rows=333333 loops=3)
>  Planning time: 0.193 ms
>  Execution time: 271.222 ms
>
> postgres=# set max_parallel_workers_per_gather = 0;
> SET
> postgres=# explain analyze select * from test order by v1, v2 limit 10;
>                                                          QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=37015.64..37015.67 rows=10 width=16) (actual
> time=211.582..211.584 rows=10 loops=1)
>    ->  Sort  (cost=37015.64..39515.64 rows=1000000 width=16) (actual
> time=211.581..211.582 rows=10 loops=1)
>          Sort Key: v1, v2
>          Sort Method: top-N heapsort  Memory: 25kB
>          ->  Seq Scan on test  (cost=0.00..15406.00 rows=1000000
> width=16) (actual time=0.085..107.522 rows=1000000 loops=1)
>  Planning time: 0.093 ms
>  Execution time: 211.608 ms
> (7 rows)
>
>
>
> --
> Thanks and Regards
> Mithun C Y
> EnterpriseDB: http://www.enterprisedb.com



-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Asymmetry between parent and child wrt "false" quals
Next
From: Rushabh Lathia
Date:
Subject: Re: [HACKERS] Possible regression with gather merge.