Re: [HACKERS] Possible regression with gather merge. - Mailing list pgsql-hackers

From Mithun Cy
Subject Re: [HACKERS] Possible regression with gather merge.
Date
Msg-id CAD__Ouge0T0nvhTUKLNGrvN7+py=N1THxajvWf0wovqJ5F7k7A@mail.gmail.com
In response to Re: [HACKERS] Possible regression with gather merge.  (Rushabh Lathia <rushabh.lathia@gmail.com>)
List pgsql-hackers
On Wed, Mar 22, 2017 at 1:09 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
> In the code, gather merge considers the limit for the sort in
> create_ordered_paths(), which is wrong. In create_ordered_paths(),
> GM should not consider the limit while costing the sort node.
>
> Attached patch fixes the bug.
Thanks, Rushabh. Now I see the non-parallel plan being picked by default:

create table test as (select id, (random()*10000)::int as v1, random() as
v2 from generate_series(1,10000000) id);

postgres=# explain analyze select * from test order by v1, v2 limit 10;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=370146.98..370147.00 rows=10 width=16) (actual time=1169.685..1169.686 rows=10 loops=1)
   ->  Sort  (cost=370146.98..395146.63 rows=9999860 width=16) (actual time=1169.683..1169.684 rows=10 loops=1)
         Sort Key: v1, v2
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on test  (cost=0.00..154053.60 rows=9999860 width=16) (actual time=0.016..590.176 rows=10000000 loops=1)
 Planning time: 0.070 ms
 Execution time: 1169.706 ms
(7 rows)


Another accidental find: setting max_parallel_workers_per_gather to a
value higher than the default results in more parallel workers, but the
query runs slower than the plan with fewer workers.

postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.437..8207.161 rows=10000000 loops=1)
   ->  Gather Merge  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.436..7600.478 rows=10000000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=696263.84..706680.36 rows=4166608 width=16) (actual time=2173.756..3105.512 rows=3333333 loops=3)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 86648kB
               ->  Parallel Seq Scan on test  (cost=0.00..95721.08 rows=4166608 width=16) (actual time=0.030..240.486 rows=3333333 loops=3)
 Planning time: 0.096 ms
 Execution time: 8537.214 ms
(10 rows)

postgres=# set max_parallel_workers_per_gather = 10;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.120..13273.805 rows=10000000 loops=1)
   ->  Gather Merge  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.119..12650.621 rows=10000000 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Sort  (cost=430168.39..436418.30 rows=2499965 width=16) (actual time=1472.799..2133.571 rows=2000000 loops=5)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 50336kB
               ->  Parallel Seq Scan on test  (cost=0.00..79054.65 rows=2499965 width=16) (actual time=0.047..201.405 rows=2000000 loops=5)
 Planning time: 0.077 ms
 Execution time: 13622.319 ms
(10 rows)



-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


