Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: autovacuum truncate exclusive lock round two
Date
Msg-id 20121123214236.297050@gmx.com
Whole thread Raw
In response to autovacuum truncate exclusive lock round two  (Jan Wieck <JanWieck@Yahoo.com>)
List pgsql-hackers
Alvaro Herrera wrote:

> Are you posting an updated patch?

Well, there wasn't exactly a consensus on what should change, so I'll
throw some initial review comments out even though I still need to
check some things in the code and do more testing.

The patch applied cleanly, compiled without warnings, and passed all
tests in `make check-world`.

The previous discussion seemed to me to achieve consensus on the need
for the feature, but a general dislike for adding so many new GUC
settings to configure it. I concur on that. I agree with the feelings
of others that just using deadlock_timeout / 10 (probably clamped to
a minimum of 10ms) would be good instead of
autovacuum_truncate_lock_check. I agree that the values of the other
two settings probably aren't too critical as long as they are fairly
reasonable, which I would define as being 20ms to 100ms with with
retries lasting no more than 2 seconds.  I'm inclined to suggest a
total time of deadlock_timeout * 2 clamped to 2 seconds, but maybe
that is over-engineering it.

There was also a mention of possibly inserting a vacuum_delay_point()
call outside of the AccessExclusiveLock. I don't feel strongly about
it, but if the page scanning cost can be tracked easily, I guess it
is better to do that.

Other than simplifying the code based on eliminating the new GUCs,
the coding issues I found were:

TRUE and FALSE were used as literals.  Since true and false are used
about 10 times as often and current code in the modified files is
mixed, I would recommend the lowercase form. We decided it wasn't
worth the trouble to convert the minority usage over, but I don't see
any reason to add new instances of the minority case.

In LockHasWaiters() the partition lock is acquired using
LW_EXCLUSIVE. I don't see why that can't be LW_SHARED. Was there a
reason for this, or was it just not changed after copy/paste?

I still need to review the timing calls, since I'm not familiar with
them so it wasn't immediately obvious to me whether they were being
used correctly. I have no reason to believe that they aren't, but
feel I should check.

Also, I want to do another pass looking just for off-by-one errors on
blkno. Again, I have no reason to believe that there is a problem; it
just seems like it would be a particularly easy type of mistake to
make and miss when a key structure has this field:
 BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */

And I want to test more.

But if a new version could be created based on the above, that would
be great. Chances seem relatively slim at this point that there will
be much else to change, if anything.

-Kevin



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: add -Wlogical-op to standard compiler options?
Next
From: Alvaro Herrera
Date:
Subject: Re: splitting *_desc routines