Re: [HACKERS] parallelize queries containing initplans - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] parallelize queries containing initplans
Date
Msg-id CA+TgmoZXy6r_A0MQ98beC+OBw8R7xeyJxf4chYtVdHp0OCUP4A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] parallelize queries containing initplans  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] parallelize queries containing initplans
List pgsql-hackers
On Sat, Nov 11, 2017 at 7:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> I have tried to follow the practice we have used for param extern
> params (SerializeParamList is in params.c) and most of the handling of
> initplans is done in nodeSubplan.c, so I choose to keep the newly
> added functions there.  However, I think keeping it in execParallel.c
> is also okay as we do it for serialize plan.

To me it feels like there is only a very loose coupling between this
code and what's currently in nodeSubplan.c, but maybe I'm wrong.  I am
also not sure execParallel.c is the perfect place either; we might
eventually split execParallel.c into multiple files if it accumulates
too many things that are too different from each other.

> I think it would need some work in execution as well because the size
> won't be fixed in that case for varchar type of params.  We might end
> up with something different as well.

Hmm, true.  But don't we also have that problem with the patch as
written?  I mean, didn't we determine at some point that the value of
an InitPlan can change, if the initplan depends on a parameter that is
correlated but not directly correlated?  The existence of
ExecReScanSetParamPlan() seems to suggest that this is indeed
possible, and that means that the parameter could be reevaluated and
evaluate, on the second time through, to a value that requires more
storage than before.  That would be a problem, because
ExecParallelReInitializeDSM reuses the old DSM.

>>  I broke a lot of long lines in your version of
>> the patch into multiple lines; please try to be attentive to this
>> issue when writing patches in general, as it is a bit tedious to go
>> through and insert line breaks in many places.
>
> Okay, but I sometimes rely on pgindent for such things as for few
> things it becomes difficult to decide which way it will be better.

pgindent doesn't in general break long lines.  There are a few cases
where it does, like reflowing comments.  But mostly you have to do
that manually.  For instance, if you write a =
verylongfunctionname(longargument(thing), stuff(thing), foobar(thing,
thing, thing)) it's going to keep all that on one line.  If you insert
line breaks between the arguments it will keep them, though.  I think
it's worth doing something like:

git diff | awk 'length($0)>81'

on your patches before you submit them.  If you see things in there
that can be reformatted to make the long lines shorter, do it.

> I have fixed the first two in attached patch and left the last one as
> I was not sure what you have in mind

Oh, yeah, that should be changed too, something like "Serialize
PARAM_EXEC parameters."  Sorry, I thought I caught all of the
references to initplans specifically, but I guess I missed a few.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Oliver Ford
Date:
Subject: Re: Fix number skipping in to_number
Next
From: Andres Freund
Date:
Subject: Re: [HACKERS] pg_upgrade to clusters with a different WAL segmentsize