Re: Delay locking partitions during query execution - Mailing list pgsql-hackers

From David Rowley
Subject Re: Delay locking partitions during query execution
Date
Msg-id CAKJS1f_7Snv=3kcB4BhRsOidWR7h6Ww1s=waFzGqBk-m1spTHg@mail.gmail.com
Whole thread Raw
In response to Re: Delay locking partitions during query execution  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Delay locking partitions during query execution  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, 1 Feb 2019 at 10:32, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sun, Jan 27, 2019 at 8:26 PM David Rowley
> <david.rowley@2ndquadrant.com> wrote:
> > One way around this would be to always perform an invalidation on the
> > partition's parent when performing a relcache invalidation on the
> > partition.  We could perhaps invalidate all the way up to the top
> > level partitioned table, that way we could just obtain a lock on the
> > target partitioned table during AcquireExecutorLocks(). I'm currently
> > only setting the delaylock flag to false for leaf partitions only.
>
> Would this problem go away if we adopted the proposal discussed in
> http://postgr.es/m/24823.1544220272@sss.pgh.pa.us and, if so, is that
> a good fix?
>
> I don't quite understand why this is happening.  It seems like as long
> as you take at least one new lock, you'll process *every* pending
> invalidation message, and that should trigger replanning as long as
> the dependencies are correct.  But maybe the issue is that you hold
> all the other locks you need already, and the lock on the partition at
> issue is only acquired during execution, at which point it's too late
> to replan.  If so, then I think something along the lines of the above
> might make a lot of sense.

I think perhaps accepting invalidations at the start of the statement
might appear to fix the problem in master, but I think there's still a
race condition around CheckCachedPlan() since we'll ignore
invalidation messages when we perform LockRelationOid() in
AcquireExecutorLocks() due to the lock already being held.  So there's
likely still a window where the invalidation message could come in and
we miss it.    That said, if lock is already taken in one session, and
some other session alters some relation we have a lock on, then that
alternation can't have been done with an AEL, as that would have
blocked on our lock, so it must be some ShareUpdateExclusiveLock
change, and if so that change must not be important enough to block
concurrently executing queries, so likely shouldn't actually break
anything.

So in summary, I think that checking for invalidation messages at the
start of the statement allows us to replan within a transaction, but
does not guarantee we can the exact correct plan for the current
settings.

I also don't have a full grasp on why we must ignore invalidation
messages when we already have a lock on the relation.  I assume it's
not a huge issue since obtaining a new lock on any other relation will
process the invalidation queue anyway. I know that f868a8143a9
recently made some changes to fix some recursive issues around these
invalidations.

(I admit to not having the best grasp on how all this works, so feel
free to shoot this down)



--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Prepare Transaction support for ON COMMIT DROP temporary tables
Next
From: Michael Paquier
Date:
Subject: Re: XLogInsert() of dangling pointer while logging replica identity