Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: autovacuum truncate exclusive lock round two
Msg-id 20121209193742.142860@gmx.com
In response to autovacuum truncate exclusive lock round two  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: autovacuum truncate exclusive lock round two
List pgsql-hackers
Jan Wieck wrote:

> Based on the discussion and what I feel is a consensus I have
> created an updated patch that has no GUC at all. The hard coded
> parameters in include/postmaster/autovacuum.h are
> 
>  AUTOVACUUM_TRUNCATE_LOCK_CHECK_INTERVAL 20 /* ms */
>  AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL 50 /* ms */
>  AUTOVACUUM_TRUNCATE_LOCK_TIMEOUT 5000 /* ms */

Since these really aren't part of the external API and are only
referenced in vacuumlazy.c, it seems more appropriate to define
them there.

> I gave that the worst workload I can think of. A pgbench (style) 
> application that throws about 10 transactions per second at it,
> so that there is constantly the need to give up the lock due to
> conflicting lock requests and then reacquiring it again. A
> "cleanup" process is periodically moving old tuples from the
> history table to an archive table, making history a rolling
> window table. And a third job that 2-3 times per minute produces
> a 10 second lasting transaction, forcing autovacuum to give up on
> the lock reacquisition.
> 
> Even with that workload autovacuum slow but steady is chopping
> away at the table.

Applies with minor offsets, builds without warning, and passes
`make check-world`. My tests based on your earlier posted test
script confirm the benefit.

There are some minor white-space issues; for example, `git diff
--color` shows some trailing spaces in comments.

There are no docs, but since there are no user-visible changes in
behavior other than better performance and more prompt and reliable
truncation of tables where we were already doing so, it doesn't seem
like any new docs are needed. Due to the nature of the problem,
tests are tricky to run correctly and take a long time to run, so I
don't see how any regression tests would be appropriate, either.

This patch seems ready for committer, and I would be comfortable
with making the minor changes I mention above and committing it. 
I'll wait a day or two to allow any other comments or objections.

To summarize, there has been pathological behavior in an
infrequently-encountered corner case of autovacuum, wasting a lot
of resources indefinitely when it is encountered; this patch gives
a major performance improvement in this situation without any
other user-visible change and without requiring any new GUCs. It
adds a new public function in the locking area to allow a process
to check whether a particular lock it is holding is blocking any
other process, and another to wrap that to make it easy to check
whether the lock held on a particular table is blocking another
process. It uses this new capability to be smarter about scheduling
autovacuum's truncation work, and to avoid throwing away
incremental progress in doing so.
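
As a rough illustration of that scheduling (not the patch code itself),
here is a toy simulation in C. Only the three interval constants come
from the patch as quoted above. The SimRel struct, the simulated clock,
the helpers sim_try_lock() and sim_lock_has_waiters(), and the
assumption that one block is processed per millisecond are all
hypothetical stand-ins for the real lock manager and heap code.

```c
#include <assert.h>
#include <stdbool.h>

/* Intervals quoted from the patch, all in milliseconds. */
#define AUTOVACUUM_TRUNCATE_LOCK_CHECK_INTERVAL 20
#define AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL  50
#define AUTOVACUUM_TRUNCATE_LOCK_TIMEOUT        5000

/* Hypothetical toy model of a table and one conflicting backend. */
typedef struct SimRel
{
    int nblocks;         /* current table length in blocks */
    int busy_until_ms;   /* another backend holds the lock until this time */
    int waiter_at_ms;    /* time a conflicting request arrives, -1 if none */
    int waiter_hold_ms;  /* how long that backend keeps the lock once granted */
} SimRel;

/* Hypothetical: can we take the exclusive lock right now, without waiting? */
static bool
sim_try_lock(const SimRel *rel, int now_ms)
{
    return now_ms >= rel->busy_until_ms;
}

/* Hypothetical: is anyone queued behind the lock we currently hold? */
static bool
sim_lock_has_waiters(const SimRel *rel, int now_ms)
{
    return rel->waiter_at_ms >= 0 && now_ms >= rel->waiter_at_ms;
}

/*
 * Shrink rel to target_nblocks.  Returns true if the target was reached,
 * false if the lock could not be reacquired within the timeout.  Blocks
 * already removed stay removed in either case.
 */
static bool
truncate_with_backoff(SimRel *rel, int target_nblocks, int *clock_ms)
{
    assert(target_nblocks >= 0 && target_nblocks <= rel->nblocks);

    while (rel->nblocks > target_nblocks)
    {
        int waited = 0;
        int since_check = 0;

        /* Acquire phase: retry at WAIT_INTERVAL, give up after TIMEOUT. */
        while (!sim_try_lock(rel, *clock_ms))
        {
            if (waited >= AUTOVACUUM_TRUNCATE_LOCK_TIMEOUT)
                return false;
            *clock_ms += AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL;
            waited += AUTOVACUUM_TRUNCATE_LOCK_WAIT_INTERVAL;
        }

        /* Work phase: one block per simulated millisecond. */
        while (rel->nblocks > target_nblocks)
        {
            rel->nblocks--;
            *clock_ms += 1;

            if (++since_check >= AUTOVACUUM_TRUNCATE_LOCK_CHECK_INTERVAL)
            {
                since_check = 0;
                if (sim_lock_has_waiters(rel, *clock_ms))
                {
                    /* Yield: the waiter now holds the lock for a while. */
                    rel->busy_until_ms = *clock_ms + rel->waiter_hold_ms;
                    rel->waiter_at_ms = -1;
                    break;
                }
            }
        }
    }
    return true;
}
```

In this model a waiter that holds the lock for 100 ms only delays the
truncation, while one that holds it for 10 seconds makes the pass give
up after the 5000 ms timeout, keeping whatever was already truncated
for the next autovacuum run to continue from.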

As such, I don't think it would be crazy to back-patch this, but I
think it would be wise to allow it to be proven on master/9.3 for a
while before considering that.

-Kevin


