Re: Parallel Seq Scan - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: Parallel Seq Scan
Date:
Msg-id: CA+TgmoYc1Q+oRBTrEG_kg65J7oBBPAAmvK1xh6GD76g8KZd8LA@mail.gmail.com
In response to: Re: Parallel Seq Scan (Haribabu Kommi <kommi.haribabu@gmail.com>)
List: pgsql-hackers
On Mon, Sep 14, 2015 at 11:04 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Using this function, the backend detaches from the message queue, so
> that the workers which are trying to put results into the queues get
> an error of SHM_MQ_DETACHED. The worker then finishes the execution
> of the plan. For this reason all the printtup return types are
> changed from void to bool.
>
> But this way the worker doesn't exit until it tries to put a tuple
> in the queue. If there are no valid tuples that satisfy the
> condition, then it may take time for the workers to exit. Am I
> correct? I am not sure how frequently such scenarios can occur.

Yes, that's a problem.  It's probably not that bad as long as the only
thing that can occur under a Funnel node is a sequential scan,
although even then the filter condition on the sequential scan could
be something expensive or highly selective.  But it will get a lot
worse when we get the ability to push joins below the funnel.
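
For anyone who hasn't been following the patch closely, the
worker-side check being described might look something like the
sketch below; worker_send_tuple() and its arguments are invented for
illustration, but shm_mq_send() and SHM_MQ_DETACHED are the existing
shm_mq interface:

#include "postgres.h"
#include "access/htup_details.h"
#include "storage/shm_mq.h"

/*
 * Try to send one tuple to the master.  If the master has already
 * detached from its end of the queue, return false so the caller can
 * stop executing the plan rather than producing tuples nobody reads.
 */
static bool
worker_send_tuple(shm_mq_handle *mqh, MinimalTuple tuple)
{
    shm_mq_result result;

    result = shm_mq_send(mqh, tuple->t_len, tuple, false);
    if (result == SHM_MQ_DETACHED)
        return false;           /* receiver is gone; give up */

    /* a blocking send either succeeds or observes a detach */
    Assert(result == SHM_MQ_SUCCESS);
    return true;
}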

I welcome ideas for solving this problem.  Basically, the problem is
that we may need to shut down the executor before execution is
complete.  This can happen because we're beneath a limit node; it can
also happen because we're on the inner side of a semijoin and have
already found one match.  Presumably, parallel plans in such cases will
be rare.  But there may be cases where they happen, and so we need
some way to handle it.
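
For instance, consider a plan shape like this one (hypothetical
output, using the Funnel and Partial Seq Scan naming from the current
patch):

Limit
  ->  Funnel
        ->  Partial Seq Scan on big_table
              Filter: expensive_qual(x)

Once the Limit has fetched as many rows as it wants, the master is
done, but the workers may still be grinding through expensive_qual()
over their share of the table.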

One idea is that the workers could exit by throwing an ERROR, maybe
after setting some flag first to say, hey, this isn't a *real* error,
we're just doing this to achieve a non-local transfer of control.  But
then we need to make sure that any instrumentation statistics still
get handled properly, which is maybe not so easy.  And it seems like
there might be other problems with things not getting shut down
properly as well.  Any code that expects a non-local exit to lead to a
(sub)transaction abort potentially gets broken by this approach.
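
In sketch form, the idea would be something like this; the shared
flag and the function are both made up, and this glosses over exactly
the instrumentation and shutdown problems described above:

#include "postgres.h"

/* hypothetical flag in the parallel DSM segment, visible to the master */
extern volatile bool *clean_worker_stop;

/*
 * Abandon plan execution via a "fake" ERROR.  Setting the flag first
 * is meant to tell the master that this abort is only a non-local
 * transfer of control, not a real failure it should re-throw.
 */
static void
worker_stop_cleanly(void)
{
    *clean_worker_stop = true;
    ereport(ERROR,
            (errcode(ERRCODE_QUERY_CANCELED),
             errmsg("parallel worker exiting: no more tuples needed")));
}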

Another idea is to try to gradually enrich the set of places that
check for shutdown.  So, for example, we could add a check at the
beginning of ExecProcNode() to return NULL if the flag's been set;
that would probably dampen the amount of additional work that could
get done in many common scenarios.  But that might break a bunch of
things too, and it's far from a complete solution anyway: for example,
we could be stuck down inside some user-defined function, and I don't
see that there's much choice in that case but to run the function to
conclusion.
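
As a sketch, with a made-up flag standing in for whatever signalling
mechanism we'd actually use:

#include "postgres.h"
#include "executor/executor.h"

/* hypothetical: set once the master decides it wants no more tuples */
extern volatile bool parallel_shutdown_requested;

TupleTableSlot *
ExecProcNode(PlanState *node)
{
    /* look like end-of-data so parents stop pulling from us */
    if (parallel_shutdown_requested)
        return NULL;

    /* ... existing per-node dispatch continues unchanged ... */
}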

This problem essentially happens because we're hoping that the workers
in parallel mode will "run ahead" of the master, producing tuples for
it to read before it gets to the point of sitting and waiting for
them.  Indeed, if that doesn't happen, we've missed the boat entirely.
But that same run-ahead opens up the problem that the master could
always decide it doesn't need any tuples after all.

Anyone have a smart idea for how to attack this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


