Re: Re: Why frequently updated tables are an issue - Mailing list pgsql-hackers
From | |
---|---|
Subject | Re: Re: Why frequently updated tables are an issue |
Date | |
Msg-id | 28292295$10983772294177e80d250915.59640688@config1.schlund.de |
List | pgsql-hackers |
I'm thinking that this subject has some cross-overs with another topic mentioned recently: improving VACUUM performance.

Just to recap: frequently updated tables are a problem because they grow significantly, accumulating many useless row versions. The usual solution is to run frequent VACUUMs on the table, though even that isn't enough for some update/change rates, because we need to scan the whole table rather than just the hot blocks.

The idea thrown around earlier this month was to create a "Cleaner" process that performs individual block-level VACUUMs on dirty blocks in shared_buffers, before they are written out by the bgwriter or a checkpoint. (It wasn't called that at the time, IIRC, but was described more in terms of the CTID list that would be required to manage this.) Such a Cleaner process could solve the problem of frequently updated tables by keeping the number of row versions reasonably low.

There wasn't much excitement when the "speed up vacuum" idea was discussed, including from me. Thinking about the frequent-update problem makes me want to revisit the idea now, since it has a much clearer role in solving that problem.

As I look at this now, it sounds very much like the freelist problem looked a while back, with a similar-ish solution. Frequent updates will happen on blocks within the T2 cache, since those blocks are heavily accessed; the more frequently a block is accessed, the more it needs cleaning. This is actually the opposite end of the cache from the bgwriter, which is interested in writing blocks that aren't frequently used, to avoid continually writing the same blocks. As a result, I'm thinking maybe we wouldn't want a single queue to manage the list of blocks that need both cleaning and writing. Discuss...

Another way to do this is to find a way of identifying the specific blocks that require cleaning. We could use an event to fire off the cleaning action, such as when an update causes a block split.
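To make the shape of the idea concrete, here is a toy sketch in C of what an event-fed cleanlist plus a per-block cleaning pass might look like. Every name here (CleanList, cleanlist_push, clean_block, and so on) is my own hypothetical simplification, not a real backend API; in particular, real xid comparison has to handle wraparound, and a real cleaning pass would also have to visit index entries.

```c
/* Hypothetical sketch, not PostgreSQL code: a fixed-size FIFO
 * "cleanlist" of block numbers awaiting a block-level vacuum, and a
 * simplified per-block cleaning pass. */
#include <stdbool.h>
#include <stdint.h>

#define CLEANLIST_SIZE 128

typedef uint32_t BlockNumber;     /* mirrors PostgreSQL's BlockNumber */
typedef uint32_t TransactionId;
#define InvalidXid ((TransactionId) 0)

/* FIFO of blocks queued for cleaning. */
typedef struct
{
    BlockNumber blocks[CLEANLIST_SIZE];
    int         head, tail, count;
} CleanList;

static void
cleanlist_init(CleanList *cl)
{
    cl->head = cl->tail = cl->count = 0;
}

/* Called from the triggering event (e.g. a block split): queue the
 * older block.  If the list is full we simply drop the request; a
 * regular VACUUM will pick the block up eventually anyway. */
static bool
cleanlist_push(CleanList *cl, BlockNumber blk)
{
    if (cl->count == CLEANLIST_SIZE)
        return false;
    cl->blocks[cl->tail] = blk;
    cl->tail = (cl->tail + 1) % CLEANLIST_SIZE;
    cl->count++;
    return true;
}

/* The Cleaner consumes blocks oldest-first. */
static bool
cleanlist_pop(CleanList *cl, BlockNumber *blk)
{
    if (cl->count == 0)
        return false;
    *blk = cl->blocks[cl->head];
    cl->head = (cl->head + 1) % CLEANLIST_SIZE;
    cl->count--;
    return true;
}

/* One row version: inserting and (possibly) deleting transaction ids. */
typedef struct
{
    TransactionId xmin;
    TransactionId xmax;       /* InvalidXid while the version is live */
} RowVersion;

/* The Cleaner's per-block pass: compact the block in place, discarding
 * versions deleted before the oldest still-running transaction.
 * Returns the number of versions kept.  (Deliberately ignores xid
 * wraparound and the index visits a real implementation would need.) */
static int
clean_block(RowVersion *rows, int n, TransactionId oldest_running)
{
    int kept = 0;

    for (int i = 0; i < n; i++)
    {
        bool dead = (rows[i].xmax != InvalidXid &&
                     rows[i].xmax < oldest_running);

        if (!dead)
            rows[kept++] = rows[i];
    }
    return kept;
}
```

The point of the sketch is only the division of labour: the updater pays a constant-time enqueue at the moment the block goes bad, and the Cleaner does the pruning asynchronously, block by block.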
In that case, the older block could be added to the Cleaner's cleanlist. The Cleaner then works through the cleanlist, doing a block-level vacuum on each entry. That way the Cleaner would stay a reasonable number of row versions behind the point of update, making a VACUUM potentially useful, while also being fired infrequently enough not to get in the way [since, as Alvaro has pointed out, the indexes need to be visited in addition to the main data block, which could introduce contention for index locks, because the frequent updater is probably using an index to reach the data].

Picking the block-split event would also mean stopping the frequent updater just at the point where significant extra costs get incurred from all the additional row versions: when a row has versions spread across many blocks.

I'm sure there are some flawed thoughts in all of that, but overall I think that a Cleaner process to perform block-level VACUUMs becomes more viable when it has an important problem to solve.

Best Regards, Simon Riggs