Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: autovacuum truncate exclusive lock round two
Date
Msg-id 50B77432.10007@Yahoo.com
Whole thread Raw
In response to Re: autovacuum truncate exclusive lock round two  ("Kevin Grittner" <kgrittn@mail.com>)
Responses Re: autovacuum truncate exclusive lock round two  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 11/28/2012 3:33 PM, Kevin Grittner wrote:
> Kevin Grittner wrote:
>
>> I still need to review the timing calls, since I'm not familiar
>> with them so it wasn't immediately obvious to me whether they
>> were being used correctly. I have no reason to believe that they
>> aren't, but feel I should check.
>
> It seems odd to me that assignment of one instr_time to another is
> done with INSTR_TIME_SET_ZERO() of the target followed by
> INSTR_TIME_ADD() with the target and the source. It seems to me
> that simple assignment would be more readable, and I can't see any
> down side.
>
> Why shouldn't:
>
>      INSTR_TIME_SET_ZERO(elapsed);
>      INSTR_TIME_ADD(elapsed, currenttime);
>      INSTR_TIME_SUBTRACT(elapsed, starttime);
>
> instead be?:
>
>      elapsed = currenttime;
>      INSTR_TIME_SUBTRACT(elapsed, starttime);
>
> And why shouldn't:
>
>      INSTR_TIME_SET_ZERO(starttime);
>      INSTR_TIME_ADD(starttime, currenttime);
>
> instead be?:
>
>      starttime = currenttime;
>
> Resetting starttime this way seems especially odd.

instr_time is LARGE_INTEGER on Win32 but struct timeval on Unix. Is
    starttime = currenttime;

portable if those are structs?

>
>> Also, I want to do another pass looking just for off-by-one
>> errors on blkno. Again, I have no reason to believe that there is
>> a problem; it just seems like it would be a particularly easy
>> type of mistake to make and miss when a key structure has this
>> field:
>>
>>   BlockNumber nonempty_pages;
>>      /* actually, last nonempty page + 1 */
>
> No problems found with that.
>
>> And I want to test more.
>
> The patch worked as advertised in all my tests, but I became
> uncomforatable with the games being played with the last autovacuum
> timestamp and the count of dead tuples. Sure, that causes
> autovacuum to kick back in until it has dealt with the truncation,
> but it makes it hard for a human looking at the stat views to see
> what's going on, and I'm not sure it wouldn't lead to bad plans due
> to stale statistics.
>
> Personally, I would rather see us add a boolean to indicate that
> autovacuum was needed (regardless of the math on the other columns)
> and use that to control the retries -- leaving the other columns
> free to reflect reality.

You mean to extend the stats by yet another column? The thing is that 
this whole case happens rarely. We see this every other month or so and 
only on a rolling window table after it got severely bloated due to some 
abnormal use (purging disabled for some operational reason). The whole 
situation resolves itself after a few minutes to maybe an hour or two.

Personally I don't think living with a few wrong stats on one table for 
that time is so bad that it warrants changing that much more code.

Lower casing TRUE/FALSE will be done.

If the LW_SHARED is enough in LockHasWaiters(), that's fine too.

I think we have a consensus that the check interval should be derived 
from deadlock_timeout/10 clamped to 10ms. I'm going to remove that GUC.

About the other two, I'm just not sure. Maybe using half the value as 
for the lock waiter interval as the lock retry interval, again clamped 
to 10ms, and simply leaving one GUC that controls how long autovacuum 
should retry this. I'm not too worried about the retry interval. 
However, the problem with that overall retry duration is that this value 
very much depends on the usage pattern of the database. If long running 
transactions (like >5 seconds) are the norm, then a hard coded value of 
2 seconds before autovacuum gives up isn't going to help much.


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin



pgsql-hackers by date:

Previous
From: Jeremy Drake
Date:
Subject: Re: [COMMITTERS] pgsql: Refactor flex and bison make rules
Next
From: Tom Lane
Date:
Subject: Re: autovacuum truncate exclusive lock round two