Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers
From: Amit Kapila
Subject: Re: autovacuum truncate exclusive lock round two
Msg-id: 001901cdb365$a6677800$f3366800$@kapila@huawei.com
In response to: Re: autovacuum truncate exclusive lock round two (Jan Wieck <JanWieck@Yahoo.com>)
Responses: Re: autovacuum truncate exclusive lock round two
List: pgsql-hackers
On Friday, October 26, 2012 11:50 AM Jan Wieck wrote:
> On 10/26/2012 1:29 AM, Amit Kapila wrote:
> > On Thursday, October 25, 2012 9:46 PM Jan Wieck wrote:
> >> On 10/25/2012 10:12 AM, Stephen Frost wrote:
> >> > Jan,
> >> >
> >> > * Jan Wieck (JanWieck@Yahoo.com) wrote:
> >> >> The problem case this patch is dealing with is rolling window tables
> >> >> that experienced some bloat. The typical example is a log table,
> >> >> that has new data constantly added and the oldest data constantly
> >> >> purged out. This data normally rotates through some blocks like a
> >> >> rolling window. If for some reason (purging turned off for example)
> >> >> this table bloats by several GB and later shrinks back to its normal
> >> >> content, soon all the used blocks are at the beginning of the heap
> >> >> and we find tens of thousands of empty pages at the end. Only now
> >> >> does the second scan take more than 1000ms and autovacuum is at risk
> >> >> to get killed while at it.
> >> >
> >> > My concern is that this could certainly also happen to a heavily updated
> >> > table in an OLTP type of environment where the requirement to take a
> >> > heavy lock to clean it up might prevent it from ever happening.. I was
> >> > simply hoping we could find a mechanism to lock just those pages we're
> >> > getting ready to nuke rather than the entire relation. Perhaps we can
> >> > consider how to make those changes alongside of changes to eliminate or
> >> > reduce the extent locking that has been painful (for me at least) when
> >> > doing massive parallel loads into a table.
> >>
> >> I've been testing this with loads of 20 writes/s to that bloated table.
> >> Preventing not only the clean up, but the following ANALYZE as well is
> >> precisely what happens. There may be multiple ways how to get into this
> >> situation, but once you're there the symptoms are the same. Vacuum fails
> >> to truncate it and causing a 1 second hiccup every minute, while vacuum
> >> is holding the exclusive lock until the deadlock detection code of
> >> another transaction kills it.
> >>
> >> My patch doesn't change the logic how we ensure that we don't zap any
> >> data by accident with the truncate and Tom's comments suggest we should
> >> stick to it. It only makes autovacuum check frequently if the
> >> AccessExclusiveLock is actually blocking anyone and then get out of the
> >> way.
> >>
> >> I would rather like to discuss any ideas how to do all this without 3
> >> new GUCs.
> >>
> >> In the original code, the maximum delay that autovacuum can cause by
> >> holding the exclusive lock is one deadlock_timeout (default 1s). It
> >> would appear reasonable to me to use max(deadlock_timeout/10, 10ms) as
> >> the interval to check for a conflicting lock request. For another
> >> transaction that needs to access the table this is 10 times faster than
> >> it is now and still guarantees that autovacuum will make some progress
> >> with the truncate.
> >
> > One other way could be to check after every few pages for a conflicting
> > lock request.
>
> How is this any different from what my patch does?

The difference is that the patch checks for waiters using two parameters,
autovacuum_truncate_lock_check and blkno % 32, whereas what I had mentioned
was to check based on blkno alone. Would it hurt too much if we directly
checked for waiters after every 32 (or any feasible number of) blocks?

> Did you even look at the code?

I had not looked at the code when I replied to your previous mail, but I
have checked it now.

With Regards,
Amit Kapila.