Re: [HACKERS] Gather Merge - Mailing list pgsql-hackers

From Rushabh Lathia
Subject Re: [HACKERS] Gather Merge
Date
Msg-id CAGPqQf2ojLdYhvDy3pRomyBpapWiiYM1W6q-rB7t-sMZ_UNfSQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Gather Merge  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] Gather Merge
List pgsql-hackers


On Thu, Mar 9, 2017 at 6:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Mar 8, 2017 at 11:59 PM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
> Here is another version of patch with the suggested changes.

Committed.


Thanks Robert for committing this.

My colleague Neha Sharma found one regression with the patch. I was about
to send this mail and noticed that you committed the patch.

Here is the small example:

Test setup:

1) ./db/bin/pgbench postgres -i -F 100 -s 20

2) update pgbench_accounts set filler = 'foo' where aid%10 = 0;

3) vacuum analyze pgbench_accounts

4)

postgres=# set max_parallel_workers_per_gather = 4;
SET

postgres=# explain select aid from pgbench_accounts where aid % 25= 0 group by aid;
ERROR:  ORDER/GROUP BY expression not found in targetlist

postgres=# set enable_indexonlyscan = off;
SET
postgres=# explain select aid from pgbench_accounts where aid % 25= 0 group by aid;
                                               QUERY PLAN                                              
--------------------------------------------------------------------------------------------------------
 Group  (cost=44708.21..45936.81 rows=10001 width=4)
   Group Key: aid
   ->  Gather Merge  (cost=44708.21..45911.81 rows=10000 width=0)
         Workers Planned: 4
         ->  Group  (cost=43708.15..43720.65 rows=2500 width=4)
               Group Key: aid
               ->  Sort  (cost=43708.15..43714.40 rows=2500 width=4)
                     Sort Key: aid
                     ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..43567.06 rows=2500 width=4)
                           Filter: ((aid % 25) = 0)
(10 rows)


- Index only scan with GM do work - but with ORDER BY clause
postgres=# set enable_seqscan = off;
SET
postgres=# explain select aid from pgbench_accounts where aid % 25= 0 order by aid;
                                                     QUERY PLAN                                                     
---------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=1000.49..113924.61 rows=10001 width=4)
   Workers Planned: 4
   ->  Parallel Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..111733.33 rows=2500 width=4)
         Filter: ((aid % 25) = 0)
(4 rows)

Debugging further I found that problem only when IndexOnlyScan under GM and
that to only with grouping. Debugging problem I found that ressortgroupref is
not getting set. That lead me thought that it might be because create_gather_merge_plan()
is building tlist, with build_path_tlist. Another problem I found is that
create_grouping_paths() is passing NULL for the targets while calling
create_gather_merge_path(). (fix_target_gm.patch)

With those changes above test is running fine, but it broke other things. Like

postgres=# explain select distinct(bid) from pgbench_accounts where filler like '%foo%' group by aid;
ERROR:  GatherMerge child's targetlist doesn't match GatherMerge

Another thing I noticed that, if we stop adding gathermerge during the
generate_gather_paths() then all the test working fine.
(comment_gm_from_generate_gather.patch)

I will continue looking at this problem.

Attaching both the patch for reference.


regards,
Rushabh Lathia
Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: [HACKERS] CREATE/ALTER ROLE PASSWORD ('value' USING 'method')
Next
From: Artur Zakirov
Date:
Subject: Re: [HACKERS] Should buffer of initialization fork have aBM_PERMANENT flag