Re: BUG #6399: knngist sometimes returns tuples in incorrect order - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #6399: knngist sometimes returns tuples in incorrect order
Date
Msg-id 4F16B616.50709@enterprisedb.com
Whole thread Raw
In response to BUG #6399: knngist sometimes returns tuples in incorrect order  (yamt@mwd.biglobe.ne.jp)
Responses Re: BUG #6399: knngist sometimes returns tuples in incorrect order  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-bugs
On 16.01.2012 11:32, yamt@mwd.biglobe.ne.jp wrote:
> The following bug has been logged on the website:
>
> Bug reference:      6399
> Logged by:          YAMAMOTO Takashi
> Email address:      yamt@mwd.biglobe.ne.jp
> PostgreSQL version: Unsupported/Unknown
> Operating system:   NetBSD
> Description:
>
> 9.2devel
> (01d83ffdcae92f75dbfd41de0b4213d241edd394)
>
> knngist seems to assume that any distances can be represented with float8.
> at least distances between int8 values can not.

Yeah. That seems like a bad assumption. It might theoretically be
possible to somehow map all int8s to float8s, but e.g numerics will not be.

> the following example uses btree_gist extension.
> results should be the same regardless of the existance of the index.
>
> create temp table t (a int8);
> insert into t values (9223372036854775806),(9223372036854775807);
> select *,0<->a as dist from t order by dist;
> create index on t using gist (a);
> set enable_seqscan=off;
> select *,0<->a as dist from t order by dist;
>
>
> CREATE TABLE
> INSERT 0 2
>            a          |        dist
> ---------------------+---------------------
>   9223372036854775806 | 9223372036854775806
>   9223372036854775807 | 9223372036854775807
> (2 rows)
>
> CREATE INDEX
> SET
>            a          |        dist
> ---------------------+---------------------
>   9223372036854775807 | 9223372036854775807
>   9223372036854775806 | 9223372036854775806
> (2 rows)

Yep, that's wrong. Both rows have the same float8 distance value, and
get outputted in wrong order. For 9.2, I think we should change gist so
that the user-defined distance function can return any scalar data type,
not just float8 (as long as it has b-tree comparison operators).

For 9.1, I'm afraid it's too late to do that. Or is it? The other option
is to hack gist to re-check tuples with equal distance values.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

pgsql-bugs by date:

Previous
From: Robert Haas
Date:
Subject: Re: fatal flex error in guc-file.l kills the postmaster
Next
From: Heikki Linnakangas
Date:
Subject: Re: BUG #6399: knngist sometimes returns tuples in incorrect order