Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode - Mailing list pgsql-committers

From Amit Kapila
Subject Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode
Date
Msg-id CAA4eK1L=u6yiz_1GH4VrydzscS23y0v33HGUM7PsB=_g-zNdpQ@mail.gmail.com
In response to Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pgsql: Add a new GUC and a reloption to enable inserts in parallel-mode  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-committers
On Wed, Mar 24, 2021 at 5:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Mar 24, 2021 at 12:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > How about a declarative approach instead?  That is, if a user would
> > like parallelized inserts into a partitioned table, she must declare
> > the table parallel-safe with some suitable annotation.  Then, checking
> > the property during DML is next door to free, and instead we have to think
> > about whether and how to enforce that the marking is valid during DDL.
> >
> > I don't honestly see a real cheap way to enforce such a property.
> > For instance, if someone does ALTER FUNCTION to remove a function's
> > parallel-safe marking, we can't really run around and verify that the
> > function is not used in any CHECK constraint.  (Aside from the cost,
> > there would be race conditions.)
> >
> > But maybe we don't have to enforce it exactly.  It could be on the
> > user's head that the marking is accurate.  We could prevent any
> > really bad misbehavior by having parallel workers error out if they
> > see they've been asked to execute a non-parallel-safe function.
> >

If we want to do something like this, we might want to provide a
function is_dml_rel_parallel_safe(relid/relname) (a better name could
be chosen) that checks all the global properties of a relation and
reports whether it is safe to perform DML on it in parallel mode. If it
is unsafe, the function could also report the offending property. This
would give users a way either to update the parallel-safe marking of
the relation or to do something about its parallel-unsafe property.
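
To make the idea concrete, here is a rough model of such a check, written
in Python rather than as backend code. Everything in it is an assumption
for illustration: the Relation class, the "objects" mapping, and the rule
that anything not marked safe gets reported. Only the function name comes
from the proposal above, and the 's'/'r'/'u' letters mirror the
proparallel values in pg_proc.

```python
from dataclasses import dataclass, field

# Markings modeled on pg_proc.proparallel: safe, restricted, unsafe.
SAFE, RESTRICTED, UNSAFE = "s", "r", "u"


@dataclass
class Relation:
    """Hypothetical stand-in for a table and its parallel-relevant objects."""
    name: str
    # Maps a description of an attached object (trigger, CHECK constraint,
    # index expression, column default, ...) to the marking of the
    # function it calls.
    objects: dict = field(default_factory=dict)
    # Child partitions, if this is a partitioned table.
    partitions: list = field(default_factory=list)


def is_dml_rel_parallel_safe(rel):
    """Return (safe, offenders) for rel and all of its partitions.

    offenders is a list of (relation name, object description, marking)
    for every attached object that is not marked parallel-safe.
    Treating 'restricted' like 'unsafe' here is a simplifying assumption.
    """
    offenders = []
    stack = [rel]
    while stack:
        current = stack.pop()
        for description, marking in current.objects.items():
            if marking != SAFE:
                offenders.append((current.name, description, marking))
        stack.extend(current.partitions)
    return (not offenders, offenders)
```

A user could call this on a partitioned table, read the offenders list,
and then either fix the reported object or leave the table undeclared.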

> > Or there are probably other ways to slice it up.  But I think some
> > outside-the-box thinking might be helpful here.
>
> It's a possibility. It's a bit different from my decision to mark
> functions as PARALLEL SAFE/RESTRICTED/UNSAFE because, if you wanted to
> deduce that without an explicit marking, you'd need to solve the
> halting problem, which I've heard is fairly difficult. In this case,
> though, the problem is solvable with a linear-time algorithm. It's
> possible that there is nothing better, but I'm not sure.
>
> One idea would be to try to cache some state in shared memory. That
> wouldn't work for an unbounded number of relations, at least not
> unless we used DSA, but you could have a hash table with a
> configurable number of slots and make the default big enough that it
> would bother few people in practice. There might be some other details
> about partitioned relations that would be useful to cache, too.
>

Wouldn't we need to invalidate the hash entries as soon as something
parallel-unsafe is associated with them? If so, how is this better
than setting a flag in relcache?
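
As a toy model of the concern (Python, not backend code; the SafetyCache
class and its methods are hypothetical names, and a plain dict stands in
for a fixed-size shared hash table), a cached answer is only trustworthy
if every DDL path that can attach something parallel-unsafe also calls
the invalidation hook:

```python
class SafetyCache:
    """Bounded cache of per-relation parallel-safety answers (a sketch)."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.slots = {}  # relid -> bool (True means parallel-safe)

    def lookup(self, relid):
        """Return the cached answer, or None if nothing is cached."""
        return self.slots.get(relid)

    def store(self, relid, safe):
        """Cache an answer; return False if the table is already full."""
        if relid not in self.slots and len(self.slots) >= self.max_slots:
            return False
        self.slots[relid] = safe
        return True

    def invalidate(self, relid):
        """Drop the entry; DDL touching the relation would have to call this."""
        self.slots.pop(relid, None)
```

In this model the invalidation call plays the same role a relcache
invalidation would, which is the comparison being asked about.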

-- 
With Regards,
Amit Kapila.


