Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com - Mailing list pgsql-hackers

From Robert Haas
Subject Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com
Date
Msg-id CA+TgmoZTznU_e2rQqYac2kS6VrTq94GEzizhDyqG1kUNBFjAaQ@mail.gmail.com
Whole thread Raw
In response to Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com
Re: explain analyze output with parallel workers - question aboutmeaning of information for explain.depesz.com
List pgsql-hackers
On Wed, Dec 6, 2017 at 1:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Right and seeing that I have prepared the patch (posted above [1])
> which fixes it such that it will resemble the non-parallel case.
> Ideally, it would have obviated the need for my previous patch which
> got committed as 778e78ae.  However, now that is committed, I could
> think of below options:
>
> 1. I shall rebase it atop what is committed and actually, I have done
> that in the attached patch.  I have also prepared a regression test
> case patch just to show the output with and without the patch.
> 2. For sort node, we can fix it by having some local_info same as
> shared_info in sort node and copy the shared_info in that or we could
> reinstate the pointer to the DSM in ExecSortReInitializeDSM() by
> looking it up in the TOC as suggested by Thomas. If we go this way,
> then we need a similar fix for hash node as well.

Well, the patch you've actually attached makes the bug go away by
removing a net of 53 lines of code.  The other approach would probably
add code.  So I am tempted to go with the approach you have here.  I
would probably change the T_HashState and T_SortState cases in
ExecParallelReInitializeDSM so that they still exist, but just do
something like this:

case T_HashState:
case T_SortState:
/* these nodes have DSM state, but no reinitialization is required */
break;

That way, it will be more clear to future readers of this code that
the lack of a reinitialize function is not an oversight, and the
compiler should optimize these cases away, merging them with the
default case.

I was a little worried when I first opened this patch that it might be
imposing a one-size-fits-all solution; just because sort and hash want
to report details from the last execution, there could be some other
executor node that wants to do otherwise.  But on reflection, that's
just fine: an individual executor node is free to accumulate stats
across rescans if it wants to do so.  It's merely that sort and hash
don't want to do that.  In fact, it's really the current system that
is imposing a straightjacket: no matter what the individual node types
choose to do, rescans are a pile of fail.

Long story short, I like the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Speeding up pg_upgrade
Next
From: Alvaro Herrera
Date:
Subject: Re: Speeding up pg_upgrade