RE: [bug?] Missed parallel safety checks, and wrong parallel safety - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: [bug?] Missed parallel safety checks, and wrong parallel safety
Date
Msg-id OS0PR01MB5716EC1D07ACCA24373C2557941B9@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to Re: [bug?] Missed parallel safety checks, and wrong parallel safety  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Sunday, July 4, 2021 1:44 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Fri, Jul 2, 2021 at 8:16 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Wed, Jun 30, 2021 at 11:46 PM Greg Nancarrow <gregn4422@gmail.com>
> wrote:
> > > I personally think "(b) provide an option to the user to specify
> > > whether inserts can be parallelized on a relation" is the preferable
> > > option.
> > > There seems to be too many issues with the alternative of trying to
> > > determine the parallel-safety of a partitioned table automatically.
> > > I think (b) is the simplest and most consistent approach, working
> > > the same way for all table types, and without the overhead of (a).
> > > Also, I don't think (b) is difficult for the user. At worst, the
> > > user can use the provided utility-functions at development-time to
> > > verify the intended declared table parallel-safety.
> > > I can't really see some mixture of (a) and (b) being acceptable.
> >
> > Yeah, I'd like to have it be automatic, but I don't have a clear idea
> > how to make that work nicely. It's possible somebody (Tom?) can
> > suggest something that I'm overlooking, though.
> 
> In general, for the non-partitioned table, where we don't have much overhead
> of checking the parallel safety and invalidation is also not a big problem so I am
> tempted to provide an automatic parallel safety check.  This would enable
> parallelism for more cases wherever it is suitable without user intervention.
> OTOH, I understand that providing automatic checking might be very costly if
> the number of partitions is more.  Can't we provide some mid-way where the
> parallelism is enabled by default for the normal table but for the partitioned
> table it is disabled by default and the user has to set it safe for enabling
> parallelism?  I agree that such behavior might sound a bit hackish.

About invalidation for non-partitioned tables, I think there is still a
problem: when a function's parallel safety changes, it is expensive to
determine via pg_depend whether the function is used by an index, a trigger,
or some other table-related object, because that check can only be done in
each backend when it accepts an invalidation message.  If we don't do that,
then whenever a function's parallel safety changes we have to invalidate every
relation's cached safety, which doesn't look very nice to me.
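
To make the trade-off concrete, here is a rough toy model in Python. It is
not PostgreSQL code: the Backend class, the pg_depend dictionary, and the
function and table names are all hypothetical stand-ins, and a real
dependency check would be a catalog scan rather than a dictionary lookup.

```python
# Toy model only: every name here is hypothetical, not a PostgreSQL API.
from dataclasses import dataclass, field


@dataclass
class Backend:
    # relation name -> cached parallel-safety flag
    relcache: dict = field(default_factory=dict)
    # counts simulated dependency checks (stand-in for pg_depend scans)
    depend_lookups: int = 0

    def invalidate_all(self):
        """Coarse: drop every cached flag when any function changes."""
        dropped = len(self.relcache)
        self.relcache.clear()
        return dropped

    def invalidate_dependents(self, func, pg_depend):
        """Precise: check each cached relation for a dependency on func."""
        dropped = 0
        for rel in list(self.relcache):
            self.depend_lookups += 1
            if func in pg_depend.get(rel, ()):
                del self.relcache[rel]
                dropped += 1
        return dropped


# Hypothetical dependency map: relation -> functions used by its
# triggers, indexes, and so on.
pg_depend = {"t1": {"f_trig"}, "t2": {"f_idx"}, "t3": set()}


def fresh_backend():
    return Backend(relcache={rel: "safe" for rel in pg_depend})


coarse = fresh_backend()
coarse_dropped = coarse.invalidate_all()

precise = fresh_backend()
precise_dropped = precise.invalidate_dependents("f_trig", pg_depend)
```

In this model the precise path keeps the unrelated cache entries but pays
one dependency check per cached relation in every backend, while the coarse
path does no checks and throws away everything.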

So, I personally think "(b) provide an option to the user to specify whether
inserts can be parallelized on a relation" is the preferable option.

Best regards,
houzj
