Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1+n6Zu4SSjCSUAtq=bcpQooS5BjhxjDhJafXtm43ZAqKQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Tue, Sep 15, 2015 at 8:34 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> >
>
> I reviewed the parallel_seqscan_funnel_v17.patch and following are my comments.
> I will continue my review with the parallel_seqscan_partialseqscan_v17.patch.
>

Thanks for the review.

> + if (inst_options)
> + {
> + instoptions = shm_toc_lookup(toc, PARALLEL_KEY_INST_OPTIONS);
> + *inst_options = *instoptions;
> + if (inst_options)
>
> Same pointer variable check, it should be if (*inst_options) as per the
> estimate and store functions.
>

makes sense, will change in next version of patch.

>
> + if (funnelstate->ss.ps.ps_ProjInfo)
> + slot = funnelstate->ss.ps.ps_ProjInfo->pi_slot;
> + else
> + slot = funnelstate->ss.ss_ScanTupleSlot;
>
> Currently, there will not be a projinfo for funnel node. 

No, that's not true, it has projinfo for the cases where it is required.

> So always it uses
> the scan tuple slot. In case if it is different, we need to add the ExecProject
> call in ExecFunnel function. 
>

It will not use Scan tuple slot for cases for column list contains expression
or other cases where projection is required.  Currently we don't need separate
ExecProject as there is no case where workers won't do projection for us.
However in future we might need for something like restrictive functions.
I think it is better to add a comment explaining the same which I will do in
next version.  Does that makes sense?


>
>
> + if (!((*dest->receiveSlot) (slot, dest)))
> + break;
>
> and
>
> +void
> +TupleQueueFunnelShutdown(TupleQueueFunnel *funnel)
> +{
> + if (funnel)
> + {
> + int i;
> + shm_mq_handle *mqh;
> + shm_mq   *mq;
> + for (i = 0; i < funnel->nqueues; i++)
> + {
> + mqh = funnel->queue[i];
> + mq = shm_mq_get_queue(mqh);
> + shm_mq_detach(mq);
> + }
> + }
> +}
>
>
> Using this function, the backend detaches from the message queue, so
> that the workers
> which are trying to put results into the queues gets an error message
> as SHM_MQ_DETACHED.
> Then worker finshes the execution of the plan. For this reason all the
> printtup return
> types are changed from void to bool.
>
> But this way the worker doesn't get exited until it tries to put a
> tuple in the queue.
> If there are no valid tuples that satisfy the condition, then it may
> take time for the workers
> to exit. Am I correct? I am not sure how frequent such scenarios can occur.
>

Yes, you are right.  The main reason to keep it like this is that we
want to finish execution without going into error path, so that
the collected stats can be communicated back.  I am not sure trying
to do better in this case is worth at this point.

>
> + if (parallel_seqscan_degree >= MaxConnections)
> + {
> + write_stderr("%s: parallel_scan_degree must be less than
> max_connections\n", progname);
> + ExitPostmaster(1);
> + }
>
> The error condition works only during server start. User still can set
> parallel seqscan degree
> more than max connection at super user session level and etc.
>

I think we can remove this check as pointed out by Robert as well.

>
> + if (!parallelstmt->inst_options)
> + (*receiver->rDestroy) (receiver);
>
> Why only when there is no instruementation only, the receiver needs to
> be destroyed?
>

No, receiver should be destroyed unconditionally.  This is remnant of the
previous versions when receiver was created for no instrumentation case.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Nicolas Barbier
Date:
Subject: Re: a funnel by any other name
Next
From: Pavel Stehule
Date:
Subject: Re: [patch] Proposal for \rotate in psql