Re: [HACKERS] Surjective functional indexes - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [HACKERS] Surjective functional indexes
Date
Msg-id 9f8b34b3-0db6-0887-256e-7e5ca8b4b047@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Surjective functional indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On 25.05.2017 19:37, Tom Lane wrote:
> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
>> My proposal is to check value of function for functional indexes instead
>> of just comparing set of effected attributes.
>> Obviously, for some complex functions it may  have negative effect on
>> update speed.
>> This is why I have added "surjective" option to index.
> This seems overcomplicated.  We would have to compute the function
> value at some point anyway.  Can't we refactor to do that earlier?
>
>             regards, tom lane


Check for affected indexes/applicability of HOT update and update of 
indexes themselves is done in two completely different parts of code.
And if we find out that values of indexed expressions are not changed, 
then we can use HOT update and indexes should not be updated
(so calculated value of function is not needed). And it is expected to 
be most frequent case.

Certainly, if value of indexed expression is changed, then we can avoid 
redundant calculation of function by storing result of calculations 
somewhere.
But it will greatly complicate all logic of updating indexes. Please 
notice, that if we have several functional indexes and only one of them 
is actually changed,
then in any case we can not use HOT and have to update all indexes. So 
we do not need to evaluate values of all indexed expressions. We just 
need to find first
changed one. So we should somehow keep track values of which expression 
are calculated and which not.

One more argument. Originally Postgres evaluates index expression only 
once (when inserting new version of tuple to the index).
Now (with this patch) Postgres has to evaluate expression three times in 
the worst case: calculate the value of expression for old and new tuples 
to make a decision bout hot update,
and the evaluate it once again when performing index update itself. Even 
if I managed to store somewhere calculated value of the expression, we 
still have to perform
twice more evaluations than before. This is why for expensive functions 
or for functions defined for frequently updated attributes (in case of 
JSON) such policy should be disabled.
And for non-expensive functions extra overhead is negligible. Also there 
is completely no overhead if indexed expression is not actually changed. 
And it is expected to be most frequent case.

At least at the particular example with YCSB benchmark, our first try 
was just to disable index update by commenting correspondent check of 
updated fields mask.
Obviously there are no extra function calculations in this case. Then I 
have implemented this patch. And performance is almost the same.
This is why I think that simplicity and modularity of code is more 
important here than elimination of redundant function calculation.


-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] Surjective functional indexes
Next
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] CREATE STATISTICS statistic_type documentation