Thread: Re: [COMMITTERS] pgsql: Allow GiST distance function to return merely a lower-bound.

On 05/15/2015 03:17 PM, Heikki Linnakangas wrote:
> On 05/15/2015 03:05 PM, Fujii Masao wrote:
>> Seems this patch causes the regression test of pg_trgm fail.
>> The regression diff that I got is:
>>
>> *** /home/postgres/pgsql/head/contrib/pg_trgm/expected/pg_trgm.out
>> 2013-07-23 16:46:22.212488785 +0900
>> --- /home/postgres/pgsql/head/contrib/pg_trgm/results/pg_trgm.out
>> 2015-05-15 20:59:16.574926732 +0900
>> ***************
>> *** 2332,2343 ****
>>     (3 rows)
>>
>>     select t <-> 'q0987wertyu0988', t from test_trgm order by t <->
>> 'q0987wertyu0988' limit 2;
>> !  ?column? |      t
>> ! ----------+-------------
>> !  0.411765 | qwertyu0988
>> !       0.5 | qwertyu0987
>> ! (2 rows)
>> !
>>     drop index trgm_idx;
>>     create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>>     set enable_seqscan=off;
>> --- 2332,2338 ----
>>     (3 rows)
>>
>>     select t <-> 'q0987wertyu0988', t from test_trgm order by t <->
>> 'q0987wertyu0988' limit 2;
>> ! ERROR:  index returned tuples in wrong order
>>     drop index trgm_idx;
>>     create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>>     set enable_seqscan=off;
>
> Hmm, OK. pg_trgm works for me, but I'll take a look. (rover_firefly also
> went red, due to rounding differences in the regression test)

There are two issues here:

1. I forgot to initialize the "recheck" variable before calling the 
distance function. pg_trgm's distance function is never lossy, and it 
doesn't set recheck to anything. If 'recheck' happens to be true, 
because it's uninitialized, the executor node will try to reorder the 
tuples.

2. There is confusion on the datatype of the distance. pg_trgm's <-> 
operator returns float4, but the GIST code and pg_trgm's GIST distance 
function always returns a float8. Gist thus returns the distance as a 
float8, but the executor node re-calculates it as a float4, and then 
tries to compare the two using float8 < operator.

The immediate problem goes away if we fix 1. and initialize the 
variable, as the executor won't try to re-calculate or compare the ORDER 
BY expressions if the index never returns a lossy tuple, but the 
confusion on the datatypes is still real.

It's not very nice that gist has a hardcoded assumption that the 
distance is always measured as a float8. It would be better to support 
whatever datatype the distance function returns, and use the type's 
comparison functions.

But as a quick fix, I think we should add a special case for float4. 
Gist should check if the datatype of the original ordering operator is 
float4, and convert the float8 used internally to float4 before 
returning it. If the datatype is anything other than float4 or float8, 
throw an error.

- Heikki