Re: autovacuum truncate exclusive lock round two - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: autovacuum truncate exclusive lock round two |
Date | |
Msg-id | 508938E7.8060906@Yahoo.com Whole thread Raw |
In response to | autovacuum truncate exclusive lock round two (Jan Wieck <JanWieck@Yahoo.com>) |
Responses |
Re: autovacuum truncate exclusive lock round two
Re: autovacuum truncate exclusive lock round two |
List | pgsql-hackers |
Steven, On 10/24/2012 10:46 PM, Stephen Frost wrote: > Jan, > > * Jan Wieck (JanWieck@Yahoo.com) wrote: >> This problem has been discussed before. Those familiar with the >> subject please skip the next paragraph. > > Apologies if this was already thought-of and ruled out for some reason, > but... > >> Because all the scanning had been done in parallel to normal DB >> activity, it needs to verify that all those blocks are still empty. > > Would it be possible to use the FSM to figure out if things have changed > since the last scan..? Does that scan update the FSM, which would then > be updated by another backend in the event that it decided to write > something there? Or do we consider the FSM to be completely > untrustworthy wrt this (and if so, I don't suppose there's any hope to > using the visibility map...)? I honestly don't know if we can trust the FSM enough when it comes to throwing away heap pages. Can we? > > The notion of having to double-scan and the AccessExclusiveLock on the > relation are telling me this work-around, while completely possible, > isn't exactly ideal... Under normal circumstances with just a few pages to trim off the end this is no problem. Those pages were the last pages just scanned by this very autovacuum, so they are found in the shared buffers anyway. All the second scan does in that case is to fetch the page once more from shared buffers to be 100% sure, we are not truncating off new tuples. We definitely need the AccessExclusiveLock to prevent someone from extending the relation at the end between our check for relation size and the truncate. Fetching 50 empty blocks from the buffer cache while at it isn't that big of a deal and that is what it normally looks like. The problem case this patch is dealing with is rolling window tables that experienced some bloat. The typical example is a log table, that has new data constantly added and the oldest data constantly purged out. This data normally rotates through some blocks like a rolling window. If for some reason (purging turned off for example) this table bloats by several GB and later shrinks back to its normal content, soon all the used blocks are at the beginning of the heap and we find tens of thousands of empty pages at the end. Only now does the second scan take more than 1000ms and autovacuum is at risk to get killed while at it. Since we have experienced this problem several times now on our production systems, something clearly needs to be done. But IMHO it doesn't happen often enough to take any risk here. Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
pgsql-hackers by date: