Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1J3es5xddQqWsARtrhgOFmwzBPYN=OaOzayFg+nLmEQjw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel Seq Scan  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Wed, Oct 14, 2015 at 3:29 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Oct 13, 2015 at 2:45 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Attached is rebased patch for partial seqscan support.
>
> Review comments:
>
> - If you're going to pgindent execParallel.c, you need to add some
> entries to typedefs.list so it doesn't mangle the formatting.
> ExecParallelEstimate's parameter list is misformatted, for example.
> Also, I think if we're going to do this we had better extract the
> pgindent changes and commit those first.  It's pretty distracting the
> way you have it.
>

Agreed, we can do those separately. So, I have reverted those changes.

> - Instead of inlining the work needed by each parallel mode in
> ExecParallelEstimate(), I think you should mimic the style of
> ExecProcNode and call a node-type specific function that is part of
> that node's public interface - here, ExecPartialSeqScanEstimate,
> perhaps.  Similarly for ExecParallelInitializeDSM.  Perhaps
> ExecPartialSeqScanInitializeDSM.
>
> - I continue to think GetParallelShmToc is the wrong approach.
> Instead, each time ExecParallelInitializeDSM or
> ExecParallelInitializeDSM calls a nodetype-specific initialized
> function (as described in the previous point), have it pass d->pcxt as
> an argument.  The node can get the toc from there if it needs it.  I
> suppose it could store a pointer to the toc in its scanstate if it
> needs it, but it really shouldn't.  Instead, it should store a pointer
> to, say, the ParallelHeapScanDesc in the scanstate.  It really should
> only care about its own shared data, so once it finds that, the toc
> shouldn't be needed any more. Then ExecPartialSeqScan doesn't need to
> look up pscan; it's already recorded in the scanstate.
>
> - ExecParallelInitializeDSMContext's new pscan_len member is 100%
> wrong.  Individual scan nodes don't get to add stuff to that context
> object.  They should store details like this in their own ScanState as
> needed.
>

Changed the patch as per above suggestions.

> - The positioning of the new nodes in various lists doesn't seem to
> entirely consistent.  nodes.h adds them after SampleScan which isn't
> terrible, though maybe immediately after SeqScan would be better, but
> _outNode has it right after BitmapOr and the switch in _copyObject has
> it somewhere else again.
>

I think this got messed up while rebasing on top of Gather node
changes, but nonetheless, I have changed it such that PartialSeqScan
node handling is after SeqScan.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Union All View execution
Next
From: Craig Ringer
Date:
Subject: PATCH: 9.5 replication origins fix for logical decoding