Re: Frequent Update Project: Design Overview ofHOTUpdates - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Frequent Update Project: Design Overview ofHOTUpdates |
Date | |
Msg-id | 1163768581.27956.671.camel@silverbirch.site Whole thread Raw |
In response to | Re: Frequent Update Project: Design Overview ofHOTUpdates (Hannu Krosing <hannu@skype.net>) |
List | pgsql-hackers |
On Fri, 2006-11-17 at 13:30 +0200, Hannu Krosing wrote: > Ühel kenal päeval, E, 2006-11-13 kell 13:42, kirjutas Csaba Nagy: > > [snip] > > > IMHO *most* UPDATEs occur on non-indexed fields. [snip] > > > > > > If my assumption is badly wrong on that then perhaps HOT would not be > > > useful after all. If we find that the majority of UPDATEs meet the HOT > > > pre-conditions, then I would continue to advocate it. > > > > Just to confirm that the scenario is valid: our application has almost > > all it's updates affecting only non-indexed columns. There are a few > > exceptions, but the vast majority is non-indexed, and that holds to the > > execution frequency too, not just for the count of tables/queries. > > One interesting case which should also be considered is conditional > indexes: > > create index on payments(payment_id) where status = 'waiting'; > > here the payment_id is not changed when processing the payment, but when > status is changed to 'processed' it still should be removed from the > index. > > How would this interact with HOT ? This can work, but isn't yet implemented in the current patch. (Neither are expressional indexes). For UPDATEs we currently assume that this will always result in an inserted row, followed by inserts into appropriate indexes. So in the executor we need to decide whether the newly inserted tuple version should have a partial index tuple or not. So in after the call to heap_update() we call ExecInsertIndexTuples() where we call ExecQual() to evaluate the situation for the partial index. For HOT we need to test the Partial Index predicate to decide whether the UPDATE was HOT or COLD (i.e. non-HOT). We would then cache that result so that if it is a COLD update and we need to insert a partial index tuple then we do not have to evaluate it twice. We need to test whether the old *and* the new tuple versions were indexed; thats an additional cost discussed below. In code we would need to refactor the ExecQual call so that the call can be made slightly earlier than it is now. But same code, same caller, same code files, just slightly modified functions. So the answer is Yes, it can work with Partial Indexes, but caveats would be: * it may only work partially ;-) - in your example the first Payment INSERT would create an index row, the next UPDATE would not and so HOT would not be usable for that UPDATE. OTOH the UPDATE isn't going to produce a new index row for the partial index, so we're not making the situation any worse either. * the cost of evals would be higher with HOT, but how much higher depends upon the cost of the evaluation of your chosen index predicate. Those caveats in themselves don't prevent HOT from being useful with Partial Indexes. Remember, the more indexes you have the more you benefit from HOT and none of the above changes that. I think I'd be the first to say that the value of indexes *does* change with HOT. In many use-cases having more indexes on a heavily updated table is now feasible with HOT. In some use-cases, some existing indexes could become more expensive to maintain and their value may not be worthwhile any longer. That's why the proposal is for HOT to be an option, not a fix for all cases. The specific use-case you mention is fairly common, which is disappointing, but a slightly different design is easily achieved. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
pgsql-hackers by date: