Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: autovacuum truncate exclusive lock round two
Date
Msg-id 001001cdb33a$d23c3a90$76b4afb0$@kapila@huawei.com
In response to Re: autovacuum truncate exclusive lock round two  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
On Thursday, October 25, 2012 9:46 PM Jan Wieck wrote:
> On 10/25/2012 10:12 AM, Stephen Frost wrote:
> > Jan,
> >
> > * Jan Wieck (JanWieck@Yahoo.com) wrote:
> >> The problem case this patch is dealing with is rolling window tables
> >> that experienced some bloat. The typical example is a log table,
> >> that has new data constantly added and the oldest data constantly
> >> purged out. This data normally rotates through some blocks like a
> >> rolling window. If for some reason (purging turned off for example)
> >> this table bloats by several GB and later shrinks back to its normal
> >> content, soon all the used blocks are at the beginning of the heap
> >> and we find tens of thousands of empty pages at the end. Only now
> >> does the second scan take more than 1000ms and autovacuum is at risk
> >> to get killed while at it.
> >
> > My concern is that this could certainly also happen to a heavily
> updated
> > table in an OLTP type of environment where the requirement to take a
> > heavy lock to clean it up might prevent it from ever happening..  I
> was
> > simply hoping we could find a mechanism to lock just those pages we're
> > getting ready to nuke rather than the entire relation.  Perhaps we can
> > consider how to make those changes alongside of changes to eliminate
> or
> > reduce the extent locking that has been painful (for me at least) when
> > doing massive parallel loads into a table.
> 
> I've been testing this with loads of 20 writes/s to that bloated table.
> Preventing not only the clean up, but the following ANALYZE as well is
> precisely what happens. There may be multiple ways to get into this
> situation, but once you're there the symptoms are the same. Vacuum fails
> to truncate it, causing a 1 second hiccup every minute, while vacuum
> is holding the exclusive lock until the deadlock detection code of
> another transaction kills it.
> 
> My patch doesn't change the logic how we ensure that we don't zap any
> data by accident with the truncate and Tom's comments suggest we should
> stick to it. It only makes autovacuum check frequently if the
> AccessExclusiveLock is actually blocking anyone and then get out of the
> way.
> 
> I would rather like to discuss any ideas how to do all this without 3
> new GUCs.
> 
> In the original code, the maximum delay that autovacuum can cause by
> holding the exclusive lock is one deadlock_timeout (default 1s). It
> would appear reasonable to me to use max(deadlock_timeout/10,10ms) as
> the interval to check for a conflicting lock request. For another
> transaction that needs to access the table this is 10 times faster than
> it is now and still guarantees that autovacuum will make some progress
> with the truncate.
One other way could be to check for a conflicting lock request after every
few pages.
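For what it's worth, the interval rule suggested above could be sketched like this (a toy illustration; `lock_check_interval_ms` is a made-up name, not anything in the tree):

```c
/*
 * Hypothetical sketch (illustrative name, not PostgreSQL source): how
 * often the truncating autovacuum could check whether its
 * AccessExclusiveLock is blocking anyone, using the
 * max(deadlock_timeout / 10, 10ms) rule from the mail above.
 * deadlock_timeout is in milliseconds, as the GUC is.
 */
static int
lock_check_interval_ms(int deadlock_timeout_ms)
{
    int interval = deadlock_timeout_ms / 10;

    /* never check more often than every 10ms */
    return (interval > 10) ? interval : 10;
}
```

With the default deadlock_timeout of 1000ms this yields a 100ms check interval: a conflicting transaction waits at most a tenth of what it does today, while autovacuum still gets useful chunks of truncate work done between checks.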

> The other two GUCs control how often and how fast autovacuum tries to
> acquire the exclusive lock in the first place. Since we actively release
> the lock *because someone needs it* it is pretty much guaranteed that
> the immediate next lock attempt fails. We on purpose do a
> ConditionalLockRelation() because there is a chance to deadlock. The
> current code only tries one lock attempt and gives up immediately. I
> don't know from what to derive a good value for how long to retry, 
Can't we do something like: after the nap, try the conditional lock again,
and if it isn't granted then take the lock unconditionally?  The reason it
might be okay to take the lock unconditionally after one failed try is
that, with your implementation, the lock will be released anyway after
every few pages (or after a short time) if there is any waiter.
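In code form, the idea would be roughly the following (all names here are illustrative stand-ins, not the real lock manager API):

```c
#include <stdbool.h>

/* Test knobs and stand-ins for the real lock primitives. */
static int attempts;                  /* conditional attempts made so far */
static int lock_free_after_attempt;   /* attempt number at which the
                                       * conditional lock would succeed */

static bool
try_conditional_lock(void)            /* stand-in for ConditionalLockRelation */
{
    attempts++;
    return attempts >= lock_free_after_attempt;
}

static void
nap_ms(int ms)                        /* stand-in for a short sleep */
{
    (void) ms;
}

static void
lock_unconditionally(void)            /* stand-in for LockRelation */
{
}

/*
 * Suggested scheme: one conditional attempt, a nap, a second conditional
 * attempt, and only then fall back to an unconditional acquire.  The
 * unconditional acquire is tolerable only because the truncate phase
 * yields the lock to waiters every few pages.  Returns the number of
 * conditional attempts made, for illustration.
 */
static int
acquire_truncate_lock(int nap_time_ms)
{
    attempts = 0;
    if (try_conditional_lock())
        return attempts;
    nap_ms(nap_time_ms);
    if (try_conditional_lock())
        return attempts;
    lock_unconditionally();
    return attempts;
}
```

This caps the retry logic at two conditional attempts, which would remove the need for a GUC controlling the number of retries.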

> but
> the nap time in between tries could be a hardcoded 20ms or using the
> cost based vacuum nap time (which defaults to 20ms).

I think using the cost-based vacuum nap time or a default value is good.
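To make the whole scheme concrete, here is a toy model of the truncate loop being discussed (illustrative names and a simulated waiter; the real patch of course works on relation blocks and the lock queue):

```c
#include <stdbool.h>

#define PAGES_PER_CHECK 32            /* check for waiters this often */

static int waiter_appears_at_page;    /* test knob: -1 means no waiter */

static bool
lock_has_waiters(int pages_done)      /* stand-in for a lock-queue probe */
{
    return waiter_appears_at_page >= 0 && pages_done >= waiter_appears_at_page;
}

/*
 * Truncate up to total_empty_pages from the end of the relation in
 * batches, checking after every batch whether anyone is waiting on the
 * AccessExclusiveLock.  If so, stop and release the lock so the waiter
 * can get in; autovacuum keeps the progress it already made and can
 * retry the rest later.  Returns the number of pages truncated.
 */
static int
truncate_with_yield(int total_empty_pages)
{
    int done = 0;

    while (done < total_empty_pages)
    {
        int batch = total_empty_pages - done;

        if (batch > PAGES_PER_CHECK)
            batch = PAGES_PER_CHECK;
        done += batch;                /* would call RelationTruncate here */

        if (lock_has_waiters(done))
            break;                    /* yield: release lock, retry later */
    }
    return done;
}
```

The point of the sketch is that every yield still makes forward progress, so even under constant load the table shrinks over a few autovacuum rounds instead of the truncate being cancelled outright.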

Adding new parameters puts overhead on users/administrators; it is always
better if the database can decide such values intelligently by itself.
However, if you feel these are parameters that can vary based on different
kinds of usage, then I think it is better to expose them to users as
configuration parameters.

With Regards,
Amit Kapila.



