Re: Parallel INSERT (INTO ... SELECT ...) - Mailing list pgsql-hackers

From Greg Nancarrow
Subject Re: Parallel INSERT (INTO ... SELECT ...)
Date
Msg-id CAJcOf-cHrmQEJgw3t6AM_w_4eFw6BobP1Z_AyivWzx9nJ8mucA@mail.gmail.com
Whole thread Raw
In response to Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
On Sat, Feb 13, 2021 at 12:17 AM Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Thu, Feb 11, 2021 at 4:43 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > On Thu, Feb 11, 2021 at 5:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > On Tue, Feb 9, 2021 at 1:04 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > > >
> > > > * I think that the concerns raised by Tsunakawa-san in:
> > > >
> > > >
https://www.postgresql.org/message-id/TYAPR01MB2990CCB6E24B10D35D28B949FEA30%40TYAPR01MB2990.jpnprd01.prod.outlook.com
> > > >
> > > > regarding how this interacts with plancache.c deserve a look.
> > > > Specifically, a plan that uses parallel insert may fail to be
> > > > invalidated when partitions are altered directly (that is without
> > > > altering their root parent).  That would be because we are not adding
> > > > partition OIDs to PlannerGlobal.invalItems despite making a plan
> > > > that's based on checking their properties.  See this (tested with all
> > > > patches applied!):
> > > >
> > >
> > > Does any current Postgres code add partition OIDs to
> > > PlannerGlobal.invalItems for a similar reason?
>
> Currently, the planner opens partitions only for SELECT queries and
> also adds them to the query's range table.  And because they are added
> to the range table, their OIDs do get added to
> PlannerGlobal.relationOids (not invalItems, sorry!) by way of
> CompleteCachedPlan() calling extract_query_dependencies(), which looks
> at Query.rtable to decide which tables/partitions to add.
>
> > > I would have thought that, for example,  partitions with a default
> > > column expression, using a function that is changed from SAFE to
> > > UNSAFE, would suffer the same plancache issue (for current parallel
> > > SELECT functionality) as we're talking about here - but so far I
> > > haven't seen any code handling this.
>
> AFAIK, default column expressions don't affect plans for SELECT
> queries.  OTOH, consider a check constraint expression as an example.
> The planner may use one to exclude a partition from the plan with its
> constraint exclusion algorithm (separate from "partition pruning").
> If the check constraint is dropped, any cached plans that used it will
> be invalidated.
>

Sorry, I got that wrong, default column expressions are relevant to
INSERT, not SELECT.

> >
> > Actually, I tried adding the following in the loop that checks the
> > parallel-safety of each partition and it seemed to work:
> >
> >             glob->relationOids =
> >                     lappend_oid(glob->relationOids, pdesc->oids[i]);
> >
> > Can you confirm, is that what you were referring to?
>
> Right.  I had mistakenly mentioned PlannerGlobal.invalItems, sorry.
>
> Although it gets the job done, I'm not sure if manipulating
> relationOids from max_parallel_hazard() or its subroutines is okay,
> but I will let the committer decide that.  As I mentioned above, the
> person who designed this decided for some reason that it is
> extract_query_dependencies()'s job to populate
> PlannerGlobal.relationOids/invalItems.
>

Yes, it doesn't really seem right doing it within max_parallel_hazard().
I tried doing it in extract_query_dependencies() instead - see
attached patch - and it seems to work, but I'm not sure if there might
be any unintended side-effects.

Regards,
Greg Nancarrow
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fallback table AM for relkinds without storage
Next
From: "Joel Jacobson"
Date:
Subject: Re: Some regular-expression performance hacking