Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: autovacuum truncate exclusive lock round two
Date
Msg-id 508A2B72.3010205@Yahoo.com
In response to Re: autovacuum truncate exclusive lock round two  (Amit Kapila <amit.kapila@huawei.com>)
Responses Re: autovacuum truncate exclusive lock round two  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
On 10/26/2012 1:29 AM, Amit Kapila wrote:
> On Thursday, October 25, 2012 9:46 PM Jan Wieck wrote:
>> On 10/25/2012 10:12 AM, Stephen Frost wrote:
>> > Jan,
>> >
>> > * Jan Wieck (JanWieck@Yahoo.com) wrote:
>> >> The problem case this patch is dealing with is rolling window tables
>> >> that experienced some bloat. The typical example is a log table,
>> >> that has new data constantly added and the oldest data constantly
>> >> purged out. This data normally rotates through some blocks like a
>> >> rolling window. If for some reason (purging turned off for example)
>> >> this table bloats by several GB and later shrinks back to its normal
>> >> content, soon all the used blocks are at the beginning of the heap
>> >> and we find tens of thousands of empty pages at the end. Only now
>> >> does the second scan take more than 1000ms and autovacuum is at risk
>> >> to get killed while at it.
>> >
>> > My concern is that this could certainly also happen to a heavily
>> > updated table in an OLTP type of environment where the requirement to
>> > take a heavy lock to clean it up might prevent it from ever
>> > happening.  I was simply hoping we could find a mechanism to lock just
>> > those pages we're getting ready to nuke rather than the entire
>> > relation.  Perhaps we can consider how to make those changes alongside
>> > of changes to eliminate or reduce the extent locking that has been
>> > painful (for me at least) when doing massive parallel loads into a
>> > table.
>>
>> I've been testing this with loads of 20 writes/s to that bloated table.
>> Preventing not only the cleanup, but the following ANALYZE as well, is
>> precisely what happens. There may be multiple ways to get into this
>> situation, but once you're there the symptoms are the same. Vacuum fails
>> to truncate the table and causes a 1 second hiccup every minute, while
>> it holds the exclusive lock until the deadlock detection code of
>> another transaction kills it.
>>
>> My patch doesn't change the logic by which we ensure that we don't zap
>> any data by accident with the truncate, and Tom's comments suggest we
>> should stick to it. It only makes autovacuum check frequently whether
>> the AccessExclusiveLock is actually blocking anyone and, if so, get out
>> of the way.
>>
>> I would rather discuss any ideas for how to do all this without 3
>> new GUCs.
>>
>> In the original code, the maximum delay that autovacuum can cause by
>> holding the exclusive lock is one deadlock_timeout (default 1s). It
>> would appear reasonable to me to use max(deadlock_timeout/10,10ms) as
>> the interval to check for a conflicting lock request. For another
>> transaction that needs to access the table this is 10 times faster than
>> it is now and still guarantees that autovacuum will make some progress
>> with the truncate.
>
>    One other way could be to check after every few pages for a conflicting
> lock request.

How is this any different from what my patch does? Did you even look at 
the code?
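
For readers following along without the patch at hand, here is a rough
model of the interval-based check discussed above. It is a Python sketch
on a simulated clock, not PostgreSQL code: the names check_interval_ms,
scan_for_truncate, page_cost_ms and has_waiters are made up for
illustration, and only the formula max(deadlock_timeout/10, 10ms) comes
from the quoted text.

```python
from typing import Callable


def check_interval_ms(deadlock_timeout_ms: int) -> int:
    """Interval between checks for a conflicting lock request.

    Implements max(deadlock_timeout/10, 10ms) from the quoted proposal.
    """
    return max(deadlock_timeout_ms // 10, 10)


def scan_for_truncate(
    n_pages: int,
    page_cost_ms: int,
    deadlock_timeout_ms: int,
    has_waiters: Callable[[int], bool],
) -> int:
    """Simulate scanning empty tail pages while holding the exclusive lock.

    has_waiters(elapsed_ms) is a hypothetical stand-in for "is someone
    blocked on our lock right now". Returns the number of pages scanned
    before the scan either finished or yielded to a waiter.
    """
    interval = check_interval_ms(deadlock_timeout_ms)
    elapsed = 0      # simulated time spent holding the lock, in ms
    last_check = 0   # simulated time of the most recent waiter check
    scanned = 0
    for _ in range(n_pages):
        # Only look for waiters once per interval, not on every page.
        if elapsed - last_check >= interval:
            last_check = elapsed
            if has_waiters(elapsed):
                break  # get out of the way; keep the progress made so far
        elapsed += page_cost_ms
        scanned += 1
    return scanned
```

With the default deadlock_timeout of 1s the interval is 100ms, so in this
model a waiter arriving 250ms into the scan is noticed at the 300ms check
instead of waiting out the full deadlock_timeout.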

>
>> The other two GUCs control how often and how fast autovacuum tries to
>> acquire the exclusive lock in the first place. Since we actively release
>> the lock *because someone needs it* it is pretty much guaranteed that
>> the immediate next lock attempt fails. We do a ConditionalLockRelation()
>> on purpose because there is a chance to deadlock. The
>> current code only tries one lock attempt and gives up immediately. I
>> don't know from what to derive a good value for how long to retry,
>
>    Can't we do something like this: after the nap, try the conditional lock,
>    and if that fails, take the lock unconditionally?

No, we cannot. This is also well documented in the code.
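
As an illustration of the alternative, retrying the conditional lock for
a bounded time and never falling back to a blocking acquire, here is a
minimal Python sketch. It uses threading.Lock as a stand-in for the
relation lock; try_lock_with_retries, retry_for_s and nap_s are
hypothetical names and do not correspond to the GUCs in the patch.

```python
import threading
import time


def try_lock_with_retries(lock: threading.Lock,
                          retry_for_s: float,
                          nap_s: float) -> bool:
    """Try a non-blocking acquire repeatedly until a deadline passes.

    Never blocks on the lock itself, so the caller cannot become part
    of a lock wait cycle. Returns True if the lock was acquired.
    """
    deadline = time.monotonic() + retry_for_s
    while True:
        # Conditional attempt: returns immediately whether or not it succeeds.
        if lock.acquire(blocking=False):
            return True
        if time.monotonic() >= deadline:
            return False  # give up; a later run can try again
        time.sleep(nap_s)
```

The caller decides what to do on False; in this sketch that simply means
skipping the truncate for now.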


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


