Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: autovacuum truncate exclusive lock round two
Date
Msg-id 50BE4996.90600@Yahoo.com
In response to Re: autovacuum truncate exclusive lock round two  ("Kevin Grittner" <kgrittn@mail.com>)
Responses Re: autovacuum truncate exclusive lock round two
List pgsql-hackers
On 12/4/2012 1:51 PM, Kevin Grittner wrote:
> Jan Wieck wrote:
>
>> [arguments for GUCs]
>
> This is getting confusing. I thought I had already conceded the
> case for autovacuum_truncate_lock_try, and you appeared to spend
> most of your post arguing for it anyway. I think. It's a little
> hard to tell. Perhaps the best thing is to present the issue to the
> list and solicit more opinions on what to do. Please correct me if
> I mis-state any of this.
>
> The primary problem this patch is solving is that in some
> workloads, autovacuum will repeatedly try to truncate the unused
> pages at the end of a table, but will continually get canceled
> after burning resources because another process wants to acquire a
> lock on the table which conflicts with the one held by autovacuum.
> This is handled by the deadlock checker, so another process must
> block for the deadlock_timeout interval each time. All work done by
> the truncate phase of autovacuum is lost on each interrupted
> attempt. Statistical information is not updated, so another attempt
> will trigger the next time autovacuum looks at whether to vacuum
> the table.
>
> It's obvious that this pattern not only fails to release
> potentially large amounts of unused space back to the OS, but the
> headbanging can continue to consume significant resources and for
> an extended period, and the repeated blocking for deadlock_timeout
> could cause latency problems.
>
> The patch has the truncate work, which requires
> AccessExclusiveLock, check at intervals for whether another process
> is waiting on its lock. That interval is one of the timings we need
> to determine, and for which a GUC was initially proposed. I think
> that the check should be fast enough that doing it once every 20ms
> as a hard-coded interval would be good enough. When it sees this
> situation, it truncates the file for as far as it has managed to
> get, releases its lock on the table, sleeps for an interval, and
> then checks to see if the lock has become available again.
>
> How long it should sleep between tries to reacquire the lock is
> another possible GUC. Again, I'm inclined to think that this could
> be hard-coded. Since autovacuum was knocked off-task after doing
> some significant work, I'm inclined to make this interval a little
> bigger, but I don't think it matters a whole lot. Anything between
> 20ms and 100ms seems sane. Maybe 50ms?
>
> At any point that it is unable to acquire the lock, there is a
> check for how long this autovacuum task has been starved for the
> lock. Initially I was arguing for twice the deadlock_timeout on the
> basis that this would probably be short enough not to leave the
> autovacuum worker sidelined for too long, but long enough for the
> attempt to get past a single deadlock between two other processes.
> This is the setting Jan is least willing to concede.
>
> If the autovacuum worker does abandon the attempt, it will keep
> retrying, since we go out of our way to prevent the autovacuum
> process from updating the statistics based on the "incomplete"
> processing. This last interval is not how long it will attempt to
> truncate, but how long it will keep one autovacuum worker making
> unsuccessful attempts to acquire the lock before it is put to other
> uses. Workers will keep coming back to this table until the
> truncate phase is completed, just as they do without the patch; the
> difference being that anytime it gets the lock, even briefly, it is
> able to persist some progress.

That is all correct.
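
For readers following along, here is a rough sketch of how the three
intervals interact. It is a Python simulation on a virtual clock, not
code from the patch. Every name in it is hypothetical, the 20ms and
50ms values are the ones floated above, and the give-up limit assumes
twice the stock deadlock_timeout of 1s. The contended() callback is an
assumed stand-in for "another process is waiting on or holding a
conflicting lock".

```python
# Hypothetical names and values for illustration only; none of these
# identifiers exist in PostgreSQL or in the patch under discussion.
CHECK_INTERVAL_MS = 20        # how often the truncate work looks for waiters
RETRY_SLEEP_MS = 50           # sleep between attempts to get the lock back
DEADLOCK_TIMEOUT_MS = 1000    # stock deadlock_timeout (1s)
GIVE_UP_MS = 2 * DEADLOCK_TIMEOUT_MS  # starvation limit before abandoning


def simulate_truncate(total_pages, pages_per_check, contended):
    """Simulate the truncate phase on a virtual clock in milliseconds.

    contended(t) is an assumed callback that returns True when another
    process wants a conflicting lock at virtual time t.
    Returns (pages_truncated, completed, elapsed_ms).
    """
    now = 0
    scanned = 0      # pages examined while holding the lock
    truncated = 0    # pages actually released, i.e. persisted progress

    while scanned < total_pages:
        # Work for one check interval while holding the exclusive lock.
        now += CHECK_INTERVAL_MS
        scanned = min(total_pages, scanned + pages_per_check)

        if scanned < total_pages and not contended(now):
            continue  # nobody is waiting, keep going

        # Finished, or somebody is waiting: truncate as far as we got.
        truncated = scanned
        if scanned == total_pages:
            break

        # Lock released here. Sleep, then see whether it is free again.
        starved_since = now
        while True:
            now += RETRY_SLEEP_MS
            if not contended(now):
                break  # lock reacquired, resume the outer loop
            if now - starved_since >= GIVE_UP_MS:
                # Abandon this attempt; progress so far is kept and a
                # later autovacuum run picks the table up again.
                return truncated, False, now

    return truncated, True, now
```

With no contention, simulate_truncate(100, 10, lambda t: False) finishes
in ten check intervals. With a callback that always reports contention,
the worker gives up after being starved for GIVE_UP_MS, but the pages
handled before the first interruption stay truncated, which is the
point of the patch.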

>
> So the question on the table is which of these three intervals
> should be GUCs, and what values to use if they aren't.

I could live with all the above defaults, but would like to see more 
comments on them.


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


