Thread: LOCK TABLE .. DEFERRABLE
The intention of this feature is to give the ability to slip non-urgent maintenance work into a normal workload. In essence, instead of lock waiters forming a queue, DEFERRABLE causes the current lock statement to always be last. It was discussed at last year's PGCon as useful for replication tools adding/removing triggers. I've also seen more than one plpgsql loop using subtransactions and LOCK TABLE .. NOWAIT to achieve a similar effect. IMO, it's much cleaner built in.
If a lock is successfully obtained on one table, but not on all tables, it releases that lock and retries the whole group later. Since inheritance acts as a group of tables (the top table plus a recursive cascade to its children), this implementation is necessary even if only a single table is specified in the command.
Like the various CONCURRENTLY commands, it waits on the set of transactions found to be blocking it. This puts it into the "waiting" state and allows isolation testing to work as expected. I started with a simple loop with a timer (and a GUC), but it didn't feel right without pg_stat_activity showing the waiting state. statement_timeout is the suggested way to put a time limit on it.
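To make the shape concrete, here is a rough sketch (not the attached patch) of how a single, non-shared relation could be locked "deferrably" with existing lmgr primitives: try the lock without queueing, and when that fails wait on the current conflicting holders before retrying. The patch additionally releases and retries whole groups of tables; function signatures are assumed to be the 9.6-era ones.

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/lock.h"

/*
 * Illustrative sketch only: acquire a relation lock "deferrably" by
 * trying without joining the wait queue and, on failure, waiting for
 * the current conflicting holders before trying again.  New lockers
 * that arrive in the meantime still get in ahead of us, giving the
 * "always last" behaviour.  Assumes a non-shared relation, so
 * MyDatabaseId is the right database for the lock tag; shared catalogs
 * would need InvalidOid there (the wrinkle mentioned below).
 */
static void
lock_relation_deferrably(Oid relid, LOCKMODE lockmode)
{
    for (;;)
    {
        LOCKTAG     tag;

        /* try to take the lock without joining the wait queue */
        if (ConditionalLockRelationOid(relid, lockmode))
            return;

        /* wait for everyone currently conflicting, then retry */
        SET_LOCKTAG_RELATION(tag, MyDatabaseId, relid);
        WaitForLockers(tag, lockmode);

        CHECK_FOR_INTERRUPTS();
    }
}

In a sketch like this, statement_timeout would interrupt the wait inside WaitForLockers(), which is what bounds the overall command as suggested above.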
Possibly Ugly stuff:
SetLocktagRelationOid() is no longer static inline. Is there a better option? My C foo isn't all that it should be. LOCK TABLE allows locking shared tables, so I can't just assume MyDatabaseId is sufficient for the lock tag.
The InvalidOid return value from RangeVarGetRelidExtended() can now appear in two different situations: the relation is missing (when missing_ok is enabled) or the relation could not be locked (when LockWaitPolicy is LockWaitNonBlock). No caller currently uses both of these options together.
LockTableRecurse() returns the OID of the relation it could not lock, so the caller can wait on the processes holding that lock. It also keeps a list of everything it did lock so those locks can be released if necessary.
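For illustration, a rough sketch of that recursion (not the patch itself; list and inheritance helpers as found in the 9.6-era tree):

#include "postgres.h"
#include "catalog/pg_inherits_fn.h"
#include "nodes/pg_list.h"
#include "storage/lmgr.h"

/*
 * Illustrative sketch of the LockTableRecurse() behaviour described
 * above, not the attached patch.  Conditionally lock the relation and
 * all of its children; return InvalidOid on success, or the OID of the
 * first relation we could not lock so the caller can wait on its
 * holders.  Every lock actually taken is appended to *locked so the
 * caller can release the whole group before retrying.
 */
static Oid
lock_table_recurse_sketch(Oid relid, LOCKMODE lockmode, List **locked)
{
    ListCell   *lc;

    if (!ConditionalLockRelationOid(relid, lockmode))
        return relid;           /* blocked here; caller waits and retries */

    *locked = lappend_oid(*locked, relid);

    /* NoLock: we take our own conditional lock on each child */
    foreach(lc, find_inheritance_children(relid, NoLock))
    {
        Oid         blocked = lock_table_recurse_sketch(lfirst_oid(lc),
                                                        lockmode, locked);

        if (OidIsValid(blocked))
            return blocked;
    }

    return InvalidOid;          /* whole group locked */
}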
I'll add it to the open November commitfest.
regards,
Rod Taylor
On 5 April 2016 at 17:43, Rod Taylor <rod.taylor@gmail.com> wrote:
> The intention of this feature is to give the ability to slip into a normal workload for non-urgent maintenance work. In essence, instead of lock waiters being in a Queue, DEFERRABLE causes the current lock statement to always be last.

Good idea; this was on my list of things to implement. I was going to call it the WAIT PATIENTLY option.

> It was discussed at last years pgCon as useful for replication tools adding/removing triggers. I've also seen more than one plpgsql loop using subtransactions and LOCK TABLE .. NOWAIT to achieve a similar effect. IMO, it's much cleaner built in.
Agreed, but your implementation is essentially just the same looping concept, which I don't much like.
> If a lock is successfully obtained on one table, but not on all tables, it releases that lock and will retry to get them as a group in the future. Since inheritance acts as a group of tables (top + recursive cascade to children), this implementation is necessary even if only a single table is specified in the command.
I'd prefer to see this as a lock wait mode where it sits in the normal lock queue BUT other lock requestors are allowed to queue jump past it. That should be just a few lines changed in the lock conflict checker and some sleight of hand in the lock queue code.
That way we avoid the busy-wait loop and multiple DEFERRABLE lock waiters queue up normally.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2016-04-05 18:10:11 +0100, Simon Riggs wrote:
> I'd prefer to see this as a lock wait mode where it sits in the normal lock queue BUT other lock requestors are allowed to queue jump past it. That should be just a few lines changed in the lock conflict checker and some sleight of hand in the lock queue code.

+1, although wading into deadlock.c makes one need a shower.
On Tue, Apr 5, 2016 at 1:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> If a lock is successfully obtained on one table, but not on all tables, it releases that lock and will retry to get them as a group in the future. Since inheritance acts as a group of tables (top + recursive cascade to children), this implementation is necessary even if only a single table is specified in the command.
>
> I'd prefer to see this as a lock wait mode where it sits in the normal lock queue BUT other lock requestors are allowed to queue jump past it. That should be just a few lines changed in the lock conflict checker and some sleight of hand in the lock queue code.
>
> That way we avoid the busy-wait loop and multiple DEFERRABLE lock waiters queue up normally.
Yeah, that would be better. I can see how to handle a single structure in that way but I'm not at all certain how to handle multiple tables and inheritance is multiple tables even with a single command.
X1 inherits from X
There is a long-running task on X1.
Someone requests LOCK TABLE X IN ACCESS EXCLUSIVE MODE WAIT PATIENTLY. Internally this also grabs X1.
The lock on X might be granted immediately and now blocks all other access to that table.
There would need to be a Locking Group kind of thing so various LockTags are treated as a single entity and granted simultaneously. That seems pretty invasive; at least I don't see anything like that today.
On 5 April 2016 at 18:34, Rod Taylor <rod.taylor@gmail.com> wrote:
> On Tue, Apr 5, 2016 at 1:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> If a lock is successfully obtained on one table, but not on all tables, it releases that lock and will retry to get them as a group in the future. Since inheritance acts as a group of tables (top + recursive cascade to children), this implementation is necessary even if only a single table is specified in the command.
>>
>> I'd prefer to see this as a lock wait mode where it sits in the normal lock queue BUT other lock requestors are allowed to queue jump past it. That should be just a few lines changed in the lock conflict checker and some sleight of hand in the lock queue code.
>>
>> That way we avoid the busy-wait loop and multiple DEFERRABLE lock waiters queue up normally.
>
> Yeah, that would be better. I can see how to handle a single structure in that way but I'm not at all certain how to handle multiple tables and inheritance is multiple tables even with a single command.

Agreed; neither can I.

> X1 inherits from X
>
> There is a long-running task on X1.
>
> Someone requests LOCK TABLE X IN ACCESS EXCLUSIVE MODE WAIT PATIENTLY. Internally this also grabs X1.
>
> The lock on X might be granted immediately and now blocks all other access to that table.
>
> There would need be a Locking Group kind of thing so various LockTags are treated as a single entity to grant them simultaneously. That seems pretty invasive; at least I don't see anything like that today.

Multiple locktags would likely be behind different LWLocks anyway, so I don't see a way to make that work.

So the only way to handle multiple locks is to do this roughly the way Rod suggests.

The only thing I would add at this stage is that if one lock is unavailable, unlocking all previous locks is unnecessary. We only need to unlock if there are lock waiters for the locks we already hold.

The use cases I am thinking of require only one table at a time, so I'm still inclined towards the non-looping approach.

Thoughts?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1 September 2016 at 21:28, Simon Riggs <simon@2ndquadrant.com> wrote:
> So the only way to handle multiple locks is to do this roughly the way Rod suggests.

I've just been reading the VACUUM code and it turns out that we already use Rod's mechanism internally. So on that basis it seems fine to support this as a useful user-level feature. If there is a better way of doing it, then that can be added later.

My proposed changes to this patch are these

1. Rename this WAIT PATIENTLY, which is IMHO a better description of what is being requested. Bikeshedding welcome.

2. Introduce a new API call LockTablePatiently() that returns bool. So its usage is similar to ConditionalLockTable(), the only difference is you supply some other wait parameters with it. This isolates the internal mechanism from the usage, so allows us to more easily support any fancy new way of doing this we think of later.

3. Use LockTablePatiently() within lockcmds.c where appropriate

4. Replace the code for waiting in VACUUM with the new call to LockTablePatiently()

So I see this as 2 patches: 1) new API and make VACUUM use new API, 2) Rod's LOCK TABLE patch

First patch attached, requires also lock by Oid. If we agree, Rod, please update your patch to match?

(I pushed this back to next CF, but we can still go ahead if we complete)

Comments?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
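For what it's worth, a minimal sketch of the polling shape such a LockTablePatiently() could take, with hypothetical wait parameters, broadly like the conditional-lock-plus-sleep loop used for VACUUM truncation; the attached patch may well differ:

#include "postgres.h"
#include "miscadmin.h"
#include "storage/lmgr.h"

/*
 * Illustrative sketch only -- hypothetical signature, not the attached
 * patch.  Try the lock without queueing, napping between attempts, and
 * report failure once the caller-supplied budget is exhausted so the
 * caller can decide whether to error, skip, or try again later.
 */
static bool
LockTablePatientlySketch(Oid relid, LOCKMODE lockmode,
                         int max_wait_ms, int nap_ms)
{
    int         waited_ms = 0;

    while (!ConditionalLockRelationOid(relid, lockmode))
    {
        if (waited_ms >= max_wait_ms)
            return false;       /* give up; lock not acquired */

        pg_usleep((long) nap_ms * 1000L);
        waited_ms += nap_ms;
        CHECK_FOR_INTERRUPTS();
    }

    return true;                /* lock acquired */
}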
On Tue, Sep 6, 2016 at 6:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 1 September 2016 at 21:28, Simon Riggs <simon@2ndquadrant.com> wrote:
>> So the only way to handle multiple locks is to do this roughly the way Rod suggests.
>
> I've just been reading the VACUUM code and it turns out that we already use Rod's mechanism internally. So on that basis it seems fine to support this as a useful user-level feature. If there is a better way of doing it, then that can be added later.
>
> My proposed changes to this patch are these
>
> 1. Rename this WAIT PATIENTLY, which is IMHO a better description of what is being requested. Bikeshedding welcome.
>
> 2. Introduce a new API call LockTablePatiently() that returns bool. So its usage is similar to ConditionalLockTable(), the only difference is you supply some other wait parameters with it. This isolates the internal mechanism from the usage, so allows us to more easily support any fancy new way of doing this we think of later.
>
> 3. Use LockTablePatiently() within lockcmds.c where appropriate
>
> 4. Replace the code for waiting in VACUUM with the new call to LockTablePatiently()
>
> So I see this as 2 patches: 1) new API and make VACUUM use new API, 2) Rod's LOCK TABLE patch
>
> First patch attached, requires also lock by Oid. If we agree, Rod, please update your patch to match?

Aside from the fact that polling is generally inefficient and wasteful of system resources, this allows for undetected deadlocks. Consider:

S1: LOCK TABLE A;
S2: LOCK TABLE B;
S1: LOCK TABLE B; -- blocks
S2: LOCK TABLE A PATIENTLY; -- retries forever

Livelock might be possible, too.

I think it would be better to think harder about what would be required to implement this natively in the lock manager. Suppose we add a flag to each PROCLOCK which, if true, indicates that the lock request is low priority. Also, we add a counter to each LOCK indicating the number of pending low-priority lock requests. When LockAcquireExtended reaches this code here...

    if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
        status = STATUS_FOUND;
    else
        status = LockCheckConflicts(lockMethodTable, lockmode,
                                    lock, proclock);

...we add an additional check to the upper branch: if the number of low-priority waiters is not 0, then we walk the wait queue; if all waiters that conflict with the requested lock mode are low-priority, then we set status = STATUS_OK. So, new lock requests refuse to queue behind low-priority lock waiters.

Is that good enough to implement the requested behavior, or do we need to do more? If we only do what's described above, then a normal-priority waiter which joins the queue after a low-priority waiter is already waiting will let the low-priority waiter go first. That's probably not desirable, but it's pretty easy to fix: the logic that determines where a new waiter enters the wait queue is in ProcSleep(), and it wouldn't be too hard to arrange for new normal-priority waiters to skip over any low-priority waiters that are at the end of the existing wait queue (but not any that are in the middle, because if we did that we'd also be skipping over normal-priority waiters, which we shouldn't).

What more? Well, even after doing those two things, it's still possible for either the simple deadlock logic in ProcSleep() or the full deadlock detector to put a low-priority waiter in front of a normal-priority waiter. However, our typical policy is to advance waiters in the wait queue as little as possible. In other words, if the wait queue contains A B C and we will deadlock unless C is moved up, we will move it ahead of B but not A if that is sufficient to avoid the deadlock. We will only move it ahead of both B and A if that is necessary to avoid deadlock. So, low-priority requests won't be moved up further than needed, which is good.

Still, it is possible to construct scenarios where we won't get perfect low-priority behavior without more invasive changes. For example, suppose we make a low-priority request queue-jump over an intervening waiter to avoid deadlocking against it. Next, a normal-priority waiter enters the queue. Then, the guy we skipped aborts. At this point, we could in theory notice that it's possible to move the low-priority request behind the new normal-priority waiter. However, I think we shouldn't do that. We certainly can't do it unconditionally because it might introduce deadlocks. We could test whether it will introduce a deadlock and do it only if not, but that's expensive. Instead, I think we should document that a low-priority request will ordinarily cause the request to be satisfied only after all conflicting normal-priority lock requests, but that this is not guaranteed in the case where the wait queue is rearranged to avoid deadlock. I don't think that limitation ought to be a tremendous problem for users, and the alternatives are pretty unappealing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
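For concreteness, a hedged sketch of the wait-queue check described above, written against the 9.6-era lock structures; the lowPriority flag on PROCLOCK (and the per-LOCK count of low-priority waiters that would gate calling this at all) are hypothetical additions, not existing fields:

#include "postgres.h"
#include "storage/lock.h"
#include "storage/proc.h"

/*
 * Illustrative sketch of the extra check: return true only if every
 * waiter whose requested mode conflicts with ours is a low-priority
 * ("patient") waiter, in which case the new request may ignore
 * waitMask and fall through to LockCheckConflicts().
 * PROCLOCK.lowPriority is a hypothetical new flag; this is not
 * existing PostgreSQL code.
 */
static bool
OnlyLowPriorityWaitersConflict(LockMethod lockMethodTable,
                               LOCKMODE lockmode, LOCK *lock)
{
    PROC_QUEUE *waitQueue = &(lock->waitQueue);
    PGPROC     *proc = (PGPROC *) waitQueue->links.next;
    int         i;

    for (i = 0; i < waitQueue->size; i++)
    {
        /* does this waiter's requested mode conflict with ours? */
        if (lockMethodTable->conflictTab[lockmode] &
            LOCKBIT_ON(proc->waitLockMode))
        {
            if (!proc->waitProcLock->lowPriority)   /* hypothetical flag */
                return false;
        }
        proc = (PGPROC *) proc->links.next;
    }

    return true;
}

The idea, as described, is that LockAcquireExtended() would consult something like this only when the hypothetical low-priority waiter count on the LOCK is non-zero, and on a true result fall through to the normal LockCheckConflicts() path instead of treating waitMask as decisive.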
On Fri, Sep 16, 2016 at 3:51 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 6, 2016 at 6:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 1 September 2016 at 21:28, Simon Riggs <simon@2ndquadrant.com> wrote:
>> So the only way to handle multiple locks is to do this roughly the way
>> Rod suggests.
>
> I've just been reading the VACUUM code and it turns out that we
> already use Rod's mechanism internally. So on that basis it seems fine
> to support this as a useful user-level feature. If there is a better
> way of doing it, then that can be added later.
>
> My proposed changes to this patch are these
>
> 1. Rename this WAIT PATIENTLY, which is IMHO a better description of
> what is being requested. Bikeshedding welcome.
>
> 2. Introduce a new API call LockTablePatiently() that returns bool. So
> its usage is similar to ConditionalLockTable(), the only difference is
> you supply some other wait parameters with it. This isolates the
> internal mechanism from the usage, so allows us to more easily support
> any fancy new way of doing this we think of later.
>
> 3. Use LockTablePatiently() within lockcmds.c where appropriate
>
> 4. Replace the code for waiting in VACUUM with the new call to
> LockTablePatiently()
>
> So I see this as 2 patches: 1) new API and make VACUUM use new API, 2)
> Rod's LOCK TABLE patch
>
> First patch attached, requires also lock by Oid. If we agree, Rod,
> please update your patch to match?
Aside from the fact that polling is generally inefficient and wasteful
of system resources, this allows for undetected deadlocks. Consider:
S1: LOCK TABLE A;
S2: LOCK TABLE B;
S1: LOCK TABLE B; -- blocks
S2: LOCK TABLE A PATIENTLY; -- retries forever
Livelock might be possible, too.
I think it would be better to think harder about what would be
required to implement this natively in the lock manager. Suppose we
add a flag to each PROCLOCK which, if true, indicates that the lock
request is low priority. Also, we add a counter to each LOCK
indicating the number of pending low-priority lock requests. When
LockAcquireExtended reaches this code here...
    if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
        status = STATUS_FOUND;
    else
        status = LockCheckConflicts(lockMethodTable, lockmode,
                                    lock, proclock);
...we add an additional check to the upper branch: if the number of
low-priority waiters is not 0, then we walk the wait queue; if all
waiters that conflict with the requested lock mode are low-priority,
then we set status = STATUS_OK. So, new lock requests refuse to queue
behind low-priority lock waiters.
Is that good enough to implement the requested behavior, or do we need
to do more? If we only do what's described above, then a
normal-priority waiter which joins the queue after a low-priority
waiter is already waiting will let the low-priority waiter go first.
That's probably not desirable, but it's pretty easy to fix: the logic
that determines where a new waiter enters the wait queue is in
ProcSleep(), and it wouldn't be too hard to arrange for new
normal-priority waiters to skip over any low-priority waiters that are
at the end of the existing wait queue (but not any that are in the
middle, because if we did that we'd also be skipping over
normal-priority waiters, which we shouldn't).
What more? Well, even after doing those two things, it's still
possible for either the simple deadlock logic in ProcSleep() or the
full deadlock detector to put a low-priority waiter in front of a
normal-priority waiter. However, our typical policy is to advance
waiters in the wait queue as little as possible. In other words, if
the wait queue contains A B C and we will deadlock unless C is moved
up, we will move it ahead of B but not A if that is sufficient to
avoid the deadlock. We will only move it ahead of both B and A if
that is necessary to avoid deadlock. So, low-priority requests won't
be moved up further than needed, which is good.
Still, it is possible to construct scenarios where we won't get
perfect low-priority behavior without more invasive changes. For
example, suppose we make a low-priority request queue-jump over an
intervening waiter to avoid deadlocking against it. Next, a
normal-priority waiter enters the queue. Then, the guy we skipped
aborts. At this point, we could in theory notice that it's possible
to move the low-priority request behind the new normal-priority
waiter. However, I think we shouldn't do that. We certainly can't do
it unconditionally because it might introduce deadlocks. We could
test whether it will introduce a deadlock and do it only if not, but
that's expensive. Instead, I think we should document that a
low-priority request will ordinarily cause the request to be satisfied
only after all conflicting normal-priority lock requests, but that
this is not guaranteed in the case where the wait queue is rearranged
to avoid deadlock. I don't think that limitation ought to be a
tremendous problem for users, and the alternatives are pretty
unappealing.
Closed in 2016-11 commitfest with "returned with feedback" status.
Please feel free to update the status once you submit an updated patch
that solves all the discussed problems.
Regards,
Hari Babu
Fujitsu Australia
On 15 September 2016 at 18:51, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Sep 6, 2016 at 6:04 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 1 September 2016 at 21:28, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> So the only way to handle multiple locks is to do this roughly the way Rod suggests.
>>
>> I've just been reading the VACUUM code and it turns out that we already use Rod's mechanism internally. So on that basis it seems fine to support this as a useful user-level feature. If there is a better way of doing it, then that can be added later.
>>
>> My proposed changes to this patch are these
>>
>> 1. Rename this WAIT PATIENTLY, which is IMHO a better description of what is being requested. Bikeshedding welcome.
>>
>> 2. Introduce a new API call LockTablePatiently() that returns bool. So its usage is similar to ConditionalLockTable(), the only difference is you supply some other wait parameters with it. This isolates the internal mechanism from the usage, so allows us to more easily support any fancy new way of doing this we think of later.
>>
>> 3. Use LockTablePatiently() within lockcmds.c where appropriate
>>
>> 4. Replace the code for waiting in VACUUM with the new call to LockTablePatiently()
>>
>> So I see this as 2 patches: 1) new API and make VACUUM use new API, 2) Rod's LOCK TABLE patch
>>
>> First patch attached, requires also lock by Oid. If we agree, Rod, please update your patch to match?
>
> Aside from the fact that polling is generally inefficient and wasteful of system resources, this allows for undetected deadlocks. Consider:
>
> S1: LOCK TABLE A;
> S2: LOCK TABLE B;
> S1: LOCK TABLE B; -- blocks
> S2: LOCK TABLE A PATIENTLY; -- retries forever

Hmmm

> Livelock might be possible, too.
>
> I think it would be better to think harder about what would be required to implement this natively in the lock manager. Suppose we add a flag to each PROCLOCK which, if true, indicates that the lock request is low priority. Also, we add a counter to each LOCK indicating the number of pending low-priority lock requests. When LockAcquireExtended reaches this code here...
>
>     if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
>         status = STATUS_FOUND;
>     else
>         status = LockCheckConflicts(lockMethodTable, lockmode,
>                                     lock, proclock);
>
> ...we add an additional check to the upper branch: if the number of low-priority waiters is not 0, then we walk the wait queue; if all waiters that conflict with the requested lock mode are low-priority, then we set status = STATUS_OK. So, new lock requests refuse to queue behind low-priority lock waiters.

Well, that's pretty much the exact design I mentioned earlier.

> Is that good enough to implement the requested behavior, or do we need to do more?

The only problem is that Rod's request was to be able to lock multiple tables in one statement, which cannot then be done that way.

But there are problems with Rod's approach, so I suggest we ditch that now and I'll implement the single table lock approach that you, me and Andres preferred. I'd rather have something sweet with one table than something crappy with multiple tables.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services