On 05/15/2015 03:17 PM, Heikki Linnakangas wrote:
> On 05/15/2015 03:05 PM, Fujii Masao wrote:
>> Seems this patch causes the regression test of pg_trgm fail.
>> The regression diff that I got is:
>>
>> *** /home/postgres/pgsql/head/contrib/pg_trgm/expected/pg_trgm.out
>> 2013-07-23 16:46:22.212488785 +0900
>> --- /home/postgres/pgsql/head/contrib/pg_trgm/results/pg_trgm.out
>> 2015-05-15 20:59:16.574926732 +0900
>> ***************
>> *** 2332,2343 ****
>> (3 rows)
>>
>> select t <-> 'q0987wertyu0988', t from test_trgm order by t <->
>> 'q0987wertyu0988' limit 2;
>> ! ?column? | t
>> ! ----------+-------------
>> ! 0.411765 | qwertyu0988
>> ! 0.5 | qwertyu0987
>> ! (2 rows)
>> !
>> drop index trgm_idx;
>> create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>> set enable_seqscan=off;
>> --- 2332,2338 ----
>> (3 rows)
>>
>> select t <-> 'q0987wertyu0988', t from test_trgm order by t <->
>> 'q0987wertyu0988' limit 2;
>> ! ERROR: index returned tuples in wrong order
>> drop index trgm_idx;
>> create index trgm_idx on test_trgm using gin (t gin_trgm_ops);
>> set enable_seqscan=off;
>
> Hmm, OK. pg_trgm works for me, but I'll take a look. (rover_firefly also
> went red, due to rounding differences in the regression test)
There are two issues here:
1. I forgot to initialize the "recheck" variable before calling the
distance function. pg_trgm's distance function is never lossy, and it
doesn't set recheck to anything. If 'recheck' happens to be true,
because it's uninitialized, the executor node will try to reorder the
tuples.
2. There is confusion on the datatype of the distance. pg_trgm's <->
operator returns float4, but the GIST code and pg_trgm's GIST distance
function always returns a float8. Gist thus returns the distance as a
float8, but the executor node re-calculates it as a float4, and then
tries to compare the two using float8 < operator.
The immediate problem goes away if we fix 1. and initialize the
variable, as the executor won't try to re-calculate or compare the ORDER
BY expressions if the index never returns a lossy tuple, but the
confusion on the datatypes is still real.
It's not very nice that gist has a hardcoded assumption that the
distance is always measured as a float8. It would be better to support
whatever datatype the distance function returns, and use the type's
comparison functions.
But as a quick fix, I think we should add a special case for float4.
Gist should check if the datatype of the original ordering operator is
float4, and convert the float8 used internally to float4 before
returning it. If the datatype is anything other than float4 or float8,
throw an error.
- Heikki