Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: autovacuum truncate exclusive lock round two |
Date | |
Msg-id | 50BE4996.90600@Yahoo.com |
In response to | Re: autovacuum truncate exclusive lock round two ("Kevin Grittner" <kgrittn@mail.com>) |
Responses |
Re: autovacuum truncate exclusive lock round two
|
List | pgsql-hackers |
On 12/4/2012 1:51 PM, Kevin Grittner wrote:
> Jan Wieck wrote:
>
>> [arguments for GUCs]
>
> This is getting confusing. I thought I had already conceded the
> case for autovacuum_truncate_lock_try, and you appeared to spend
> most of your post arguing for it anyway. I think. It's a little
> hard to tell. Perhaps the best thing is to present the issue to the
> list and solicit more opinions on what to do. Please correct me if
> I mis-state any of this.
>
> The primary problem this patch is solving is that in some
> workloads, autovacuum will repeatedly try to truncate the unused
> pages at the end of a table, but will continually get canceled
> after burning resources because another process wants to acquire a
> lock on the table which conflicts with the one held by autovacuum.
> This is handled by the deadlock checker, so another process must
> block for the deadlock_timeout interval each time. All work done by
> the truncate phase of autovacuum is lost on each interrupted
> attempt. Statistical information is not updated, so another attempt
> will trigger the next time autovacuum looks at whether to vacuum
> the table.
>
> It's obvious that this pattern not only fails to release
> potentially large amounts of unused space back to the OS, but the
> headbanging can continue to consume significant resources and for
> an extended period, and the repeated blocking for deadlock_timeout
> could cause latency problems.
>
> The patch has the truncate work, which requires
> AccessExclusiveLock, check at intervals for whether another process
> is waiting on its lock. That interval is one of the timings we need
> to determine, and for which a GUC was initially proposed. I think
> that the check should be fast enough that doing it once every 20ms
> as a hard-coded interval would be good enough. When it sees this
> situation, it truncates the file for as far as it has managed to
> get, releases its lock on the table, sleeps for an interval, and
> then checks to see if the lock has become available again.
>
> How long it should sleep between tries to reacquire the lock is
> another possible GUC. Again, I'm inclined to think that this could
> be hard-coded. Since autovacuum was knocked off-task after doing
> some significant work, I'm inclined to make this interval a little
> bigger, but I don't think it matters a whole lot. Anything between
> 20ms and 100ms seems sane. Maybe 50ms?
>
> At any point that it is unable to acquire the lock, there is a
> check for how long this autovacuum task has been starved for the
> lock. Initially I was arguing for twice the deadlock_timeout on the
> basis that this would probably be short enough not to leave the
> autovacuum worker sidelined for too long, but long enough for the
> attempt to get past a single deadlock between two other processes.
> This is the setting Jan is least willing to concede.
>
> If the autovacuum worker does abandon the attempt, it will keep
> retrying, since we go out of our way to prevent the autovacuum
> process from updating the statistics based on the "incomplete"
> processing. This last interval is not how long it will attempt to
> truncate, but how long it will keep one autovacuum worker making
> unsuccessful attempts to acquire the lock before it is put to other
> uses. Workers will keep coming back to this table until the
> truncate phase is completed, just as it does without the patch; the
> difference being that anytime it gets the lock, even briefly, it is
> able to persist some progress.

That is all correct.

> So the question on the table is which of these three intervals
> should be GUCs, and what values to use if they aren't.

I could live with all the above defaults, but would like to see more
comments on them.
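
For readers trying to keep the three intervals apart, the following is a minimal sketch of the behavior described above, run on a simulated clock. It is not the patch and not PostgreSQL code: the constant names, the `lock_busy` callback, and `pages_per_check` are all hypothetical, and the values are only the candidates floated in this thread (20ms, 50ms, twice deadlock_timeout, with deadlock_timeout at its 1s default).

```python
from dataclasses import dataclass

# Hypothetical names; values are the candidates discussed in the thread.
CHECK_INTERVAL_MS = 20       # how often the truncate work looks for lock waiters
RETRY_SLEEP_MS = 50          # sleep between attempts to reacquire the lock
DEADLOCK_TIMEOUT_MS = 1000   # deadlock_timeout at its default of 1s
GIVE_UP_AFTER_MS = 2 * DEADLOCK_TIMEOUT_MS  # how long a worker may be starved


@dataclass
class Result:
    pages_truncated: int   # progress that was persisted
    completed: bool        # False if the worker gave up on the lock
    elapsed_ms: int        # simulated time when the worker stopped


def truncate_with_backoff(pages_to_truncate, lock_busy, pages_per_check=10):
    """Simulate the truncate phase on a virtual clock.

    lock_busy(now_ms) -> bool is an assumed stand-in for "another process
    holds or is waiting for a conflicting lock at time now_ms".
    """
    now = 0
    done = 0
    while done < pages_to_truncate:
        # Phase 1: try to get the lock, sleeping between tries, and give
        # up once this worker has been starved for too long.
        starved_since = now
        while lock_busy(now):
            if now - starved_since >= GIVE_UP_AFTER_MS:
                return Result(done, False, now)
            now += RETRY_SLEEP_MS

        # Phase 2: holding the lock, work in slices and check for waiters
        # after each slice. On a waiter, keep the progress made so far
        # and release the lock by leaving this loop.
        while done < pages_to_truncate:
            done = min(pages_to_truncate, done + pages_per_check)
            now += CHECK_INTERVAL_MS
            if lock_busy(now):
                break
    return Result(done, True, now)


if __name__ == "__main__":
    # A conflicting locker shows up at 100ms and never goes away.
    print(truncate_with_backoff(100, lambda t: t >= 100))
```

In this toy model, a locker that appears at 100ms and never leaves makes the worker give up at 2100ms with 50 of 100 pages already truncated, which is the "persist some progress" case; a locker that is only present from 100ms to 300ms delays completion but does not prevent it.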
Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin