Thread: BUG #6325: Useless Index updates

BUG #6325: Useless Index updates

From
dmigowski@ikoffice.de
Date:
The following bug has been logged on the website:

Bug reference:      6325
Logged by:          Daniel Migowski
Email address:      dmigowski@ikoffice.de
PostgreSQL version: 8.3.16
Operating system:   Linux
Description:=20=20=20=20=20=20=20=20

It seems that an update to a row in a table always removes the element from
an index and adds it again. Wouldn't it be faster to check for equality of
the index parameters in the OLD and NEW record first?=20

I have this problem with an functional index using a relative expensive
index function, and noticed that the index function is always called even if
the parameter to the index function has not changed. Wouldn't it be better
to validate that the input to the index functions has not changed, instead
of calling the index function over and over again? Especially since the
index functions seems to be called with the new and the old value anyway.

I can understand that this might be a precaution in the case that the index
function isn't stable (is it even possible to use such a function for an
index?), but in the stable case comparing the input parameters allows for
much much faster table updates.

Re: BUG #6325: Useless Index updates

From
Craig Ringer
Date:
On 12/04/2011 08:54 PM, dmigowski@ikoffice.de wrote:
> The following bug has been logged on the website:
>
> Bug reference:      6325
> Logged by:          Daniel Migowski
> Email address:      dmigowski@ikoffice.de
> PostgreSQL version: 8.3.16
> Operating system:   Linux
> Description:
>
> It seems that an update to a row in a table always removes the element from
> an index and adds it again. Wouldn't it be faster to check for equality of
> the index parameters in the OLD and NEW record first?

- This isn't a bug report, it's a feature/enhancement request. Please
   use the mailing lists.

- You're reporting this issue against an old patch release of an old
   major release. Why not check with 9.1?

- The index isn't always updated. Check out HOT (introduced in 8.4, the
   release after your current one) which reduces unnecessary index
   updates in cases where the old and new row can fit on the same
   heap page.

- In most other cases the index update can't be avoided, because
   the new and old rows are on different database pages. The old index
   entry has to remain in place so that still-running transactions that
   can see the old row can still find it in the index, so it can't be
   overwritten and instead a new entry has to be added.

> I have this problem with an functional index using a relative expensive
> index function, and noticed that the index function is always called even if
> the parameter to the index function has not changed. Wouldn't it be better
> to validate that the input to the index functions has not changed, instead
> of calling the index function over and over again? Especially since the
> index functions seems to be called with the new and the old value anyway.

That's a more interesting one. Perhaps you could write it up in more
detail, with a test case, and submit it to the pgsql-general mailing list?

This isn't just about functions anyway. Pg would have to compare *all*
inputs to the old index expression to see if they were the same.
Otherwise, in an expression like  f(g(x,y),z)  Pg would not have any
stored value for the result of g(x,y) to compare against. It'd have to
instead compare (x1,y1,z1) to (x2,y2,z2) and decide that if they were
the same the result of the index expression hadn't changed.

That's probably possible, but I'm not sure it'd be a win over just
evaluating the expression in most cases. How would Pg know when to do
it? Using function COST parameters?

Essentially, this isn't as simple as it looks at face value.

> I can understand that this might be a precaution in the case that the index
> function isn't stable (is it even possible to use such a function for an
> index?)

No, it isn't possible. Index functions must be immutable, not just
stable, so their output must be determined entirely by their parameters.
At least on newer versions STABLE or VOLATILE functions should be
rejected in index expressions.

--
Craig Ringer

Re: BUG #6325: Useless Index updates

From
hubert depesz lubaczewski
Date:
On Sun, Dec 04, 2011 at 12:54:25PM +0000, dmigowski@ikoffice.de wrote:
> It seems that an update to a row in a table always removes the element from
> an index and adds it again. Wouldn't it be faster to check for equality of
> the index parameters in the OLD and NEW record first?

http://www.depesz.com/index.php/2008/11/05/waiting-for-84-suppress_redundant_updates_trigger/

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: BUG #6325: Useless Index updates

From
Andreas Karlsson
Date:
On 2011-12-04 15:02, Craig Ringer wrote:
> - The index isn't always updated. Check out HOT (introduced in 8.4, the
> release after your current one) which reduces unnecessary index
> updates in cases where the old and new row can fit on the same
> heap page.

Minor correction: HOT was introduced in 8.3.

http://www.postgresql.org/docs/9.1/static/release-8-3.html

--
Andreas Karlsson

Re: BUG #6325: Useless Index updates

From
Craig Ringer
Date:
On 12/05/2011 08:16 AM, Andreas Karlsson wrote:
> On 2011-12-04 15:02, Craig Ringer wrote:
>> - The index isn't always updated. Check out HOT (introduced in 8.4, the
>> release after your current one) which reduces unnecessary index
>> updates in cases where the old and new row can fit on the same
>> heap page.
>
> Minor correction: HOT was introduced in 8.3.
>
> http://www.postgresql.org/docs/9.1/static/release-8-3.html

Whoops, thanks. In that case the OP is already getting the benefit of
reduced index updates as much as is possible with PostgreSQL's MVCC design.

--
Craig Ringer

Re: BUG #6325: Useless Index updates

From
Daniel Migowski
Date:
Thank you all for your explanations. I will continue this on pgsql-general.

Daniel Migowski

________________________________________
Von: Craig Ringer [ringerc@ringerc.id.au]
Gesendet: Sonntag, 4. Dezember 2011 15:02
Bis: Daniel Migowski
Cc: pgsql-bugs@postgresql.org
Betreff: Re: [BUGS] BUG #6325: Useless Index updates

On 12/04/2011 08:54 PM, dmigowski@ikoffice.de wrote:
> The following bug has been logged on the website:
>
> Bug reference:      6325
> Logged by:          Daniel Migowski
> Email address:      dmigowski@ikoffice.de
> PostgreSQL version: 8.3.16
> Operating system:   Linux
> Description:
>
> It seems that an update to a row in a table always removes the element fr=
om
> an index and adds it again. Wouldn't it be faster to check for equality of
> the index parameters in the OLD and NEW record first?

- This isn't a bug report, it's a feature/enhancement request. Please
   use the mailing lists.

- You're reporting this issue against an old patch release of an old
   major release. Why not check with 9.1?

- The index isn't always updated. Check out HOT (introduced in 8.4, the
   release after your current one) which reduces unnecessary index
   updates in cases where the old and new row can fit on the same
   heap page.

- In most other cases the index update can't be avoided, because
   the new and old rows are on different database pages. The old index
   entry has to remain in place so that still-running transactions that
   can see the old row can still find it in the index, so it can't be
   overwritten and instead a new entry has to be added.

> I have this problem with an functional index using a relative expensive
> index function, and noticed that the index function is always called even=
 if
> the parameter to the index function has not changed. Wouldn't it be better
> to validate that the input to the index functions has not changed, instead
> of calling the index function over and over again? Especially since the
> index functions seems to be called with the new and the old value anyway.

That's a more interesting one. Perhaps you could write it up in more
detail, with a test case, and submit it to the pgsql-general mailing list?

This isn't just about functions anyway. Pg would have to compare *all*
inputs to the old index expression to see if they were the same.
Otherwise, in an expression like  f(g(x,y),z)  Pg would not have any
stored value for the result of g(x,y) to compare against. It'd have to
instead compare (x1,y1,z1) to (x2,y2,z2) and decide that if they were
the same the result of the index expression hadn't changed.

That's probably possible, but I'm not sure it'd be a win over just
evaluating the expression in most cases. How would Pg know when to do
it? Using function COST parameters?

Essentially, this isn't as simple as it looks at face value.

> I can understand that this might be a precaution in the case that the ind=
ex
> function isn't stable (is it even possible to use such a function for an
> index?)

No, it isn't possible. Index functions must be immutable, not just
stable, so their output must be determined entirely by their parameters.
At least on newer versions STABLE or VOLATILE functions should be
rejected in index expressions.

--
Craig Ringer