Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: Parallel Seq Scan
Date
Msg-id CAJrrPGcncrq9V_B+_1a20XdN2YojhcqiQf=5gV1JKPGq5euG9A@mail.gmail.com
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Nov 3, 2015 at 9:41 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Oct 23, 2015 at 4:41 PM, Amit Kapila <amit.kapila16@gmail.com>
> wrote:
>>
>> On Fri, Oct 23, 2015 at 10:33 AM, Robert Haas <robertmhaas@gmail.com>
>> wrote:
>
> Please find the rebased partial seq scan patch attached with this
> mail.
>
> Robert suggested me off list that we should once try to see if we
> can use Seq Scan node instead of introducing a new Partial Seq Scan
> node. I have analyzed to see if we can use the SeqScan node (containing
> parallel flag) instead of introducing new partial seq scan and found that
> we primarily need to change most of the functions in nodeSeqScan.c to
> have a parallel flag check and do something special for Partial Seq Scan
> and apart from that we need special handling in function
> ExecSupportsBackwardScan().  In general, I think we can make
> SeqScan node parallel-aware by having some special paths without
> introducing much complexity and that can save us code-duplication
> between nodeSeqScan.c and nodePartialSeqScan.c.  One thing that makes
> me slightly uncomfortable with this approach is that for partial seq scan,
> currently the plan looks like:
>
>                                 QUERY PLAN
> --------------------------------------------------------------------------
>  Gather  (cost=0.00..2588194.25 rows=9990667 width=4)
>    Number of Workers: 1
>    ->  Partial Seq Scan on t1  (cost=0.00..89527.51 rows=9990667 width=4)
>          Filter: (c1 > 10000)
> (4 rows)
>
> Now instead of displaying Partial Seq Scan, if we just display Seq Scan,
> then it might confuse user, so it is better to add some thing indicating
> parallel node if we want to go this route.

IMO, changing the display from Partial Seq Scan to Seq Scan should not
confuse users, as long as the documentation clearly states that all plans
under a Gather node are parallel plans.

This works for execution nodes that execute fully under a Gather node.
It does not work for parallel aggregates, so the aggregate node below the
Gather node would still have to be labeled as partial.

I think this suggestion arose mainly because of the duplicated code between
Partial Seq Scan and Seq Scan. What if we use only the Seq Scan node, but
display it as Partial Seq Scan based on a flag stored in the plan? This
avoids the need to add new plan nodes.
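
A rough sketch of the flag idea, in C. This is only an illustration: the
names DemoSeqScan, is_partial, demo_scan_label and demo_explain_scan are
made up here and are not taken from the patch or from PostgreSQL's actual
plan node or EXPLAIN code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for a scan plan node.
 * This is NOT PostgreSQL's SeqScan struct.
 */
typedef struct DemoSeqScan
{
    const char *relname;    /* relation being scanned */
    bool        is_partial; /* assumed flag: set when planned under Gather */
} DemoSeqScan;

/* Choose the label an EXPLAIN-like printer would show for this node. */
static const char *
demo_scan_label(const DemoSeqScan *scan)
{
    assert(scan != NULL);
    return scan->is_partial ? "Partial Seq Scan" : "Seq Scan";
}

/*
 * Format one plan line into buf. Returns what snprintf returns: the
 * number of characters that would have been written, excluding the
 * terminating NUL.
 */
static int
demo_explain_scan(const DemoSeqScan *scan, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "->  %s on %s",
                    demo_scan_label(scan), scan->relname);
}
```

With something along these lines there is a single node type, and only the
label in the plan output depends on the flag.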


Regards,
Hari Babu
Fujitsu Australia


