Re: [HACKERS] Surjective functional indexes - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [HACKERS] Surjective functional indexes
Date
Msg-id 8e3a0102-07c8-f073-ac02-ddaa3e24425d@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Surjective functional indexes  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: [HACKERS] Surjective functional indexes  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers


On 11.05.2018 07:48, David G. Johnston wrote:
On Thursday, February 1, 2018, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:

Old + New for check = 2
plus calculate again in index = 3

Yes, we have to calculate the value of index expression for original and updated version of the record. If them are equal, then it is all we have to do with this index: hot update is applicable.
In this case we calculate index expression twice.
But if them are not equal, then hot update is not applicable and we have to update index. In this case expression will be calculated one more time. So totally three times.
This is why, if calculation of index expression is very expensive, then effect of this optimization may be negative even if value of index expression is not changed.

For the old/new comparison and the re-calculate if changed dynamics - is this a side effect of separation of concerns only or is there some other reason the old computed value already stored in the index isn't compared to the one and only function result of the new tuple which, if found to be different, is then stored.  One function invocation, which has to happen anyway, and one extra equality check.  Seems like this could be used for non-functional indexes too, so that mere presence in the update listing doesn't negate HOT if the column didn't actually change (if I'm not mis-remembering something here anyway...)

Also, create index page doc typo from site:  "with an total" s/b "with a total" (expression cost less than 1000) - maybe add a comma for 1,000

David J.



Sorry, may be I do not completely understand you.
So whats happed before this patch:

- On update postgres compares old and new values of all changed attributes to determine whether them are actually changed.
- If value of some indexed attribute is changed,  then hot update is not applicable and we have to rebuild indexed.
- Otherwise hot update is used and indexes should not be updated.

What is changed:
  
-  When some of attributes, on which functional index depends, is changed, then we calculate value of index expression. It is done using existed FormIndexDatum function which stores calculated expression value in the provided slot. This evaluation of index expressions and their comparison is done in access/heap/heapam.c file.
- Only if old and new values of index expression are different, then hot update is really not applicable.
- In this case we have to rebuild indexes. It is done by ExecInsertIndexTuples in executor/execIndexing.c which calls FormIndexDatum once again to calculate index expression.

So in principle, it is certainly possible to store value of index expression calculated in ProjIndexIsUnchanged and reuse it ExecInsertIndexTuples.
But I do not know right place (context) where this value can be stored.
And also it will add some dependency between  heapam and execIndexing modules. Also it is necessary to take in account that ProjIndexIsUnchanged is not always called. So index expression value may be not present.

Finally, most of practically used index expressions are not expensive. It is usually something like extraction of JSON component. Cost of execution of this function is much smaller than cost of extracting and unpacking toasted JSON value. So I do not think that such optimization will be really useful, while it is obvious that it significantly complicate code.
Also please notice that FormIndexDatum is used in some other places:

  utils/adt/selfuncs.c
  utils/sort/tuplesort.c
  mmands/constraint.c

So this patch just adds two more calls of this function.
-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

pgsql-hackers by date:

Previous
From: Hartmut Holzgraefe
Date:
Subject: Re: Having query cache in core
Next
From: Tatsuo Ishii
Date:
Subject: Re: Having query cache in core