Re: Frequent Update Project: Design Overview ofHOTUpdates - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Frequent Update Project: Design Overview ofHOTUpdates
Date
Msg-id 1163768581.27956.671.camel@silverbirch.site
Whole thread Raw
In response to Re: Frequent Update Project: Design Overview ofHOTUpdates  (Hannu Krosing <hannu@skype.net>)
List pgsql-hackers
On Fri, 2006-11-17 at 13:30 +0200, Hannu Krosing wrote:
> Ühel kenal päeval, E, 2006-11-13 kell 13:42, kirjutas Csaba Nagy:
> > [snip]
> > > IMHO *most* UPDATEs occur on non-indexed fields. [snip]
> > >
> > > If my assumption is badly wrong on that then perhaps HOT would not be
> > > useful after all. If we find that the majority of UPDATEs meet the HOT
> > > pre-conditions, then I would continue to advocate it.
> >
> > Just to confirm that the scenario is valid: our application has almost
> > all it's updates affecting only non-indexed columns. There are a few
> > exceptions, but the vast majority is non-indexed, and that holds to the
> > execution frequency too, not just for the count of tables/queries.
>
> One interesting case which should also be considered is conditional
> indexes:
>
> create index on payments(payment_id) where status = 'waiting';
>
> here the payment_id is not changed when processing the payment, but when
> status is changed to 'processed' it still should be removed from the
> index.
>
> How would this interact with HOT ?

This can work, but isn't yet implemented in the current patch. (Neither
are expressional indexes).

For UPDATEs we currently assume that this will always result in an
inserted row, followed by inserts into appropriate indexes. So in the
executor we need to decide whether the newly inserted tuple version
should have a partial index tuple or not. So in after the call to
heap_update() we call ExecInsertIndexTuples() where we call ExecQual()
to evaluate the situation for the partial index.

For HOT we need to test the Partial Index predicate to decide whether
the UPDATE was HOT or COLD (i.e. non-HOT). We would then cache that
result so that if it is a COLD update and we need to insert a partial
index tuple then we do not have to evaluate it twice. We need to test
whether the old *and* the new tuple versions were indexed; thats an
additional cost discussed below.

In code we would need to refactor the ExecQual call so that the call can
be made slightly earlier than it is now. But same code, same caller,
same code files, just slightly modified functions.

So the answer is Yes, it can work with Partial Indexes, but caveats
would be:
*  it may only work partially ;-) - in your example the first Payment
INSERT would create an index row, the next UPDATE would not and so HOT
would not be usable for that UPDATE. OTOH the UPDATE isn't going to
produce a new index row for the partial index, so we're not making the
situation any worse either.
*  the cost of evals would be higher with HOT, but how much higher
depends upon the cost of the evaluation of your chosen index predicate.

Those caveats in themselves don't prevent HOT from being useful with
Partial Indexes. Remember, the more indexes you have the more you
benefit from HOT and none of the above changes that.

I think I'd be the first to say that the value of indexes *does* change
with HOT. In many use-cases having more indexes on a heavily updated
table is now feasible with HOT. In some use-cases, some existing indexes
could become more expensive to maintain and their value may not be
worthwhile any longer. That's why the proposal is for HOT to be an
option, not a fix for all cases.

The specific use-case you mention is fairly common, which is
disappointing, but a slightly different design is easily achieved.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




pgsql-hackers by date:

Previous
From: Teodor Sigaev
Date:
Subject: Proposal: syntax of operation with tsearch's configuration
Next
From: Bruce Momjian
Date:
Subject: Re: [PATCHES] replication docs: split single vs.