Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Parallel Seq Scan
Date
Msg-id CAJrrPGftdPzoUO5iZjd42FeCXPxb1ZRg7uFCUU4g6gcmLoDB4g@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Sep 3, 2015 at 8:21 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Jul 23, 2015 at 7:43 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>>
>> Hi Amit,
>>
>> The latest v16 patch cannot be applied to the latest
>> master as is.
>> 434873806a9b1c0edd53c2a9df7c93a8ba021147 changed various
>> lines in heapam.c, so it probably conflicts with this.
>>
>
> Attached, find the rebased version of patch.  It fixes the comments raised
> by Jeff  Davis and Antonin Houska.  The main changes in this version are
> now it supports sync scan along with parallel sequential scan (refer
> heapam.c)
> and the patch has been split into two parts, first contains the code for
> Funnel node and infrastructure to support the same and second contains
> the code for PartialSeqScan node  and its infrastructure.
>

Thanks for the updated patch.

Parallel scan has a problem with subqueries; please see below.

postgres=# explain select * from test01 where kinkocord not in (select
kinkocord from test02 where tenpocord = '001');
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
Funnel on test01  (cost=0.00..155114352184.12 rows=20000008 width=435)
  Filter: (NOT (SubPlan 1))
  Number of Workers: 16
  ->  Partial Seq Scan on test01  (cost=0.00..155114352184.12 rows=20000008 width=435)
        Filter: (NOT (SubPlan 1))
        SubPlan 1
          ->  Materialize  (cost=0.00..130883.67 rows=385333 width=5)
                ->  Funnel on test02  (cost=0.00..127451.01 rows=385333 width=5)
                      Filter: (tenpocord = '001'::bpchar)
                      Number of Workers: 16
                      ->  Partial Seq Scan on test02  (cost=0.00..127451.01 rows=385333 width=5)
                            Filter: (tenpocord = '001'::bpchar)
  SubPlan 1
    ->  Materialize  (cost=0.00..130883.67 rows=385333 width=5)
          ->  Funnel on test02  (cost=0.00..127451.01 rows=385333 width=5)
                Filter: (tenpocord = '001'::bpchar)
                Number of Workers: 16
                ->  Partial Seq Scan on test02  (cost=0.00..127451.01 rows=385333 width=5)
                      Filter: (tenpocord = '001'::bpchar)
(19 rows)

postgres=# explain analyze select * from test01 where kinkocord not in
(select kinkocord from test02 where tenpocord = '001');
ERROR:  badly formatted node string "SUBPLAN :subLinkType 2 :testexpr"...
CONTEXT:  parallel worker, pid 32879
postgres=#


Also, the number of workers (16) shown in the explain analyze plan is
not the number actually allotted, because in my configuration I set
max_worker_processes to only 8. I feel the plan should show the
allotted workers, not the planned workers. If query execution is slow
because of a lack of workers while the plan shows 16 workers, the user
may conclude that the query is slow even with 16 workers, when it did
not actually run with 16.
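
To illustrate the difference between planned and allotted workers, here
is a rough sketch. It is not code from the patch: allotted_workers() and
its slots_in_use parameter are hypothetical names, and it simply assumes
that a node can obtain at most the worker slots still free under
max_worker_processes.

```c
/*
 * Hypothetical helper, not part of the patch or of PostgreSQL.
 * Returns the number of workers a node could actually obtain:
 * the planner's request, capped by the slots still free under
 * max_worker_processes.
 */
int
allotted_workers(int planned, int max_worker_processes, int slots_in_use)
{
    int free_slots = max_worker_processes - slots_in_use;

    /* No free slots: the scan would run without any workers. */
    if (free_slots < 0)
        free_slots = 0;

    /* Never report more workers than the planner asked for. */
    return planned < free_slots ? planned : free_slots;
}
```

With the values from this report (16 planned, max_worker_processes = 8,
and assuming no other background workers are running), this gives 8,
which is the number I would expect the plan to show.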


Regards,
Hari Babu
Fujitsu Australia


