Re: HOT latest patch - version 8 - Mailing list pgsql-patches
From | Simon Riggs |
---|---|
Subject | Re: HOT latest patch - version 8 |
Date | |
Msg-id | 1184537041.4512.402.camel@ebony.site |
In response to | Re: HOT latest patch - version 8 (Heikki Linnakangas <heikki@enterprisedb.com>) |
Responses | Re: HOT latest patch - version 8 |
List | pgsql-patches |
On Fri, 2007-07-13 at 16:22 +0100, Heikki Linnakangas wrote:
> Heikki Linnakangas wrote:
> > I have some suggestions which I'll post separately,
>
> I'm looking for ways to make the patch simpler and less invasive. We may
> want to put back some of this stuff, or come up with a more clever
> solution, in future releases, but right now let's keep it simple.

I believe we're all trying to do that, but I would like to see some
analysis of which techniques are truly effective and which are not.
Simpler code may not have desirable behaviour and then the whole lot of
code is pointless. Let's make it effective by making it complex enough.
I'm not clear where the optimum lies. (c.f. Flying Buttresses).

> A significant chunk of the complexity and new code in the patch comes
> from pruning hot chains and reusing the space for new updates. Because
> we can't reclaim dead space in the page like a VACUUM does, without
> holding the vacuum lock, we have to deal with pages that contain deleted
> tuples, and be able to reuse them, and keep track of the changes in
> tuple length etc.
>
> A much simpler approach would be to try to acquire the vacuum lock, and
> compact the page the usual way, and fall back to a cold update if we
> can't get the lock immediately.
>
> The obvious downside of that is that if a page is continuously pinned,
> we can't HOT update tuples on it. Keeping in mind that the primary use
> case for HOT is largeish tables, small tables are handled pretty well by
> autovacuum, chances are pretty good that you can get a vacuum lock when
> you need it.

The main problem HOT seeks to avoid is wasted inserts into indexes, and
the subsequent VACUUMing that requires. Small tables have smaller
indexes, so that the VACUUMing is less of a problem. If we have hot
spots in larger tables, DSM would allow us to avoid the I/O on the main
table, but we would still need to scan the indexes. So HOT *can* be
better than DSM.

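To make the quoted proposal concrete, a minimal toy sketch of "try the vacuum lock, compact the page, otherwise fall back to a cold update" might look like the following. The `Page` class and `update_tuple` function are hypothetical illustrations, not code from the patch or from PostgreSQL, and a simple pin count stands in for whatever actually prevents the vacuum lock from being granted.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy heap page (hypothetical model, not PostgreSQL's page layout)."""
    capacity: int   # total tuple slots on the page
    live: int = 0   # slots holding live tuples
    dead: int = 0   # slots holding dead, reclaimable tuples
    pins: int = 0   # pins held by other backends

    def free(self) -> int:
        return self.capacity - self.live - self.dead

    def try_vacuum_lock(self) -> bool:
        # Assumption: the lock is granted only if nobody else pins the page.
        return self.pins == 0

    def compact(self) -> None:
        # Reclaim every dead slot, as a VACUUM-style compaction would.
        self.dead = 0


def update_tuple(page: Page) -> str:
    """Return "hot" if the new version fits on the same page, else "cold"."""
    if page.free() == 0 and page.dead > 0 and page.try_vacuum_lock():
        page.compact()
    if page.free() > 0:
        # The old version becomes dead and the new one takes a free slot,
        # so the live count is unchanged and the dead count grows by one.
        page.dead += 1
        return "hot"
    # No room and no lock: the new version must go to another page.
    return "cold"
```

In this model a full page that is continuously pinned returns "cold" on every call, which is the downside under discussion here.
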
I'm worried that requiring the vacuum lock in all cases will mean that
HOT will be ineffective where it is needed most - in the hot spots -
i.e. the blocks that contain frequently updated rows. [As an aside, in
OLTP it is frequently the index blocks that become hot spots, so
reducing index inserts because of UPDATEs will also reduce block
contention]

Our main test case for OLTP is DBT-2 which follows TPC-C in being
perfectly scalable with no hot spots in the heap and limited hot spots
in the indexes. As such it's a poor test of real world applications,
where Benfold's Law holds true. Requiring the vacuum lock in all cases
would allow good benchmark performance but would probably fail in the
real world at providing good long term performance.

I'm interested in some numbers that show which we need. I'm thinking of
some pg_stats output that shows how many vac locks were taken and how
many prunes were made. Something general that allows some beta testers
to provide feedback on the efficacy of the patch.

That leads to the suggestion that we should make the HOT pruning logic
into an add-on patch, commit it, but evaluate its performance during
beta. If we have no clear evidence of additional benefit, we remove it
again.

I'm not in favour of background retail vacuuming by the bgwriter. The
timeliness of that is (similarly) in question and I think bgwriter has
enough work to do already.

[Just as a note to all performance testers: HOT is designed to show
long-term steady performance. Short performance tests frequently show no
benefit if sufficient RAM is available to avoid the table bloat and we
avoid hitting the point where autovacuums kick in. I know Heikki knows
this, just not sure we actually said it.]

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
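
As an illustration of the kind of counters suggested above (vacuum locks taken versus prunes made), here is a small toy simulation. It is only a sketch under stated assumptions: the counter names (`vac_lock_attempts`, `vac_locks_taken`, `tuples_pruned`, `hot_updates`, `cold_updates`) are invented for this example and are not actual pg_stats columns, and the single-page model ignores everything except slot counts and whether the page is pinned.

```python
from collections import Counter


def run_updates(pinned_flags, capacity=4, live=1):
    """Simulate updates against one toy page and count what happened.

    pinned_flags[i] is True if another backend pins the page during
    update i. All names are hypothetical, not PostgreSQL identifiers.
    """
    stats = Counter()
    dead = 0
    for pinned in pinned_flags:
        if live + dead >= capacity and dead > 0:
            # Page is full but has reclaimable space: ask for the lock.
            stats["vac_lock_attempts"] += 1
            if not pinned:
                stats["vac_locks_taken"] += 1
                stats["tuples_pruned"] += dead
                dead = 0
        if live + dead < capacity:
            # Room on the page: old version goes dead, new one stays here.
            dead += 1
            stats["hot_updates"] += 1
        else:
            # Toy simplification: the page simply stays full.
            stats["cold_updates"] += 1
    return stats


if __name__ == "__main__":
    print("never pinned: ", dict(run_updates([False] * 10)))
    print("always pinned:", dict(run_updates([True] * 10)))
```

Comparing the two runs shows why such numbers matter for the hot-spot case: in this model an unpinned page keeps every update HOT, while a continuously pinned page falls back to cold updates as soon as it fills.
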