Re: group locking: incomplete patch, just for discussion - Mailing list pgsql-hackers

From Andres Freund
Subject Re: group locking: incomplete patch, just for discussion
Date
Msg-id 20141121013107.GC25784@alap3.anarazel.de
In response to Re: group locking: incomplete patch, just for discussion  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: group locking: incomplete patch, just for discussion
List pgsql-hackers
On 2014-11-20 20:21:07 -0500, Robert Haas wrote:
> On Thu, Nov 20, 2014 at 4:45 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> And you haven't answered how you'd like to fix that.  The
> >> only answer you've given so far, that I can see, is that maybe we can
> >> foresee all the locks that the child might take and not initiate
> >> parallelism if any of them conflict with locks we already hold. But
> >> that's not a reasonable approach; we have no reasonable way of
> >> predicting what locks the child will need, and even if we did, things
> >> can change between plan time and execution time.
> >
> > We don't need to predict all future locks. We only need to check which
> > self-exclusive locks we are already holding.
> 
> I can't follow that logic.  Why do you think self-exclusive locks are
> the only ones that matter?

All the others can be acquired by jumping in front of the lock's
waitqueue?
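
To make that concrete, here is a minimal, self-contained C sketch of the check I have in mind (toy lock modes and a toy conflict table, not the real lock.c structures): before entering parallel mode, scan the lock modes the leader already holds and refuse parallelism if any of them conflicts with itself, i.e. is self-exclusive.

#include <stdbool.h>
#include <stdio.h>

/* Toy lock modes, loosely modeled on the heavyweight lock modes. */
typedef enum
{
    TOY_ACCESS_SHARE = 0,
    TOY_ROW_EXCLUSIVE,
    TOY_SHARE,
    TOY_ACCESS_EXCLUSIVE,
    TOY_NUM_MODES
} ToyLockMode;

/* conflicts[m] has bit n set iff mode m conflicts with mode n. */
static const unsigned conflicts[TOY_NUM_MODES] = {
    [TOY_ACCESS_SHARE]     = 1u << TOY_ACCESS_EXCLUSIVE,
    [TOY_ROW_EXCLUSIVE]    = (1u << TOY_SHARE) | (1u << TOY_ACCESS_EXCLUSIVE),
    [TOY_SHARE]            = (1u << TOY_ROW_EXCLUSIVE) | (1u << TOY_ACCESS_EXCLUSIVE),
    [TOY_ACCESS_EXCLUSIVE] = (1u << TOY_ACCESS_SHARE) | (1u << TOY_ROW_EXCLUSIVE) |
                             (1u << TOY_SHARE) | (1u << TOY_ACCESS_EXCLUSIVE),
};

/* A mode is "self-exclusive" if it conflicts with itself. */
static bool
mode_is_self_exclusive(ToyLockMode mode)
{
    return (conflicts[mode] & (1u << mode)) != 0;
}

/* Refuse to start workers if the leader already holds a self-exclusive lock. */
static bool
parallelism_allowed(const ToyLockMode *held, int nheld)
{
    for (int i = 0; i < nheld; i++)
        if (mode_is_self_exclusive(held[i]))
            return false;
    return true;
}

int
main(void)
{
    ToyLockMode held[] = { TOY_ACCESS_SHARE, TOY_ROW_EXCLUSIVE };

    printf("parallelism allowed: %d\n", parallelism_allowed(held, 2));
    return 0;
}

Locks that aren't self-exclusive can simply be granted to group members ahead of unrelated waiters, so only the self-exclusive ones need to block parallelism up front.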

> > I don't buy the plan time/execution time argument. We'll need to be able
> > to adapt plans to the availability of bgworker slots *anyway*. Otherwise
> > we either need to set the number of bgworkers to "MaxConnections *
> > MaxParallelism" or live with errors as soon as too many processes use
> > parallelism. The former is clearly unrealistic given PG's per-backend
> > overhead and the latter will be a *much* bigger PITA than all the
> > deadlock scenarios talked about here. It'll constantly fail, even if
> > everything is ok.
> 
> I certainly think that any parallel node needs to be able to cope with
> getting fewer workers than it would ideally like to have.  But that's
> different from saying that when our own backend takes new locks, we
> have to invalidate plans.  Those seem like largely unrelated problems.

No need to invalidate them if you have the infrastructure to run with no
parallel workers - just use the path that's taken when there are no
workers available.
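
As a rough illustration of that fallback (hypothetical names, not an actual executor node): the parallel-aware node only learns at execution time how many workers it actually got, and when the answer is zero it just runs its existing serial code path, so the plan itself never needs to change.

#include <stdio.h>

/* Hypothetical parallel-aware node; nworkers_launched is only known at runtime. */
typedef struct ParallelNode
{
    int nworkers_planned;   /* what the planner hoped for */
    int nworkers_launched;  /* how many bgworker slots we actually got */
} ParallelNode;

static void
execute_serially(ParallelNode *node)
{
    printf("no slots free (planned %d): running the ordinary single-backend path\n",
           node->nworkers_planned);
}

static void
execute_with_workers(ParallelNode *node)
{
    printf("coordinating %d background workers\n", node->nworkers_launched);
}

/* The plan is never invalidated; the same node degrades to the serial path. */
static void
execute(ParallelNode *node)
{
    if (node->nworkers_launched == 0)
        execute_serially(node);
    else
        execute_with_workers(node);
}

int
main(void)
{
    ParallelNode node = { .nworkers_planned = 4, .nworkers_launched = 0 };

    execute(&node);                 /* no bgworker slots free: serial path */
    node.nworkers_launched = 2;
    execute(&node);                 /* slots available: parallel path */
    return 0;
}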

> > Another one is that the frontend process does, from some function or so,
> > CREATE POLICY. Triggering a relcache flush. And RelationClearRelation()
> > essentially assumes that a referenced relcache entry can't change (which
> > is the reason the keep_* stuff is in RelationClearRelation()).
> 
> I don't quite follow that one.  Are you saying that the frontend would
> do a CREATE POLICY while there are backend workers running?

Yes.

>  I
> seriously doubt that can ever be made safe, but I'd choose to prohibit
> it using something other than the heavyweight lock manager.

Right. It's perfectly fine to forbid it imo. But there's lots of places
that have assumptions about the locking behaviour baked in, and the
relevant bugs will be hard to find.

> > I think there's some local state as well, so that's probably not so
> > easy. I guess this needs input from Kevin.
> 
> Yeah.  I'd be OK to have v1 disable parallelism at serializable, and
> fix that in v2, but of course if we can avoid that it's even better.

I don't have a problem with that either.

> The heavyweight locking issue is really killing me, though.

I don't really understand why you're not content with just detecting
deadlocks for now. Everything else seems like bells and whistles for
later.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


