Re: knngist - 0.8 - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: knngist - 0.8
Date
Msg-id 4D6D33CC.9060802@sigaev.ru
Whole thread Raw
In response to Re: knngist - 0.8  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: knngist - 0.8
List pgsql-hackers
> I did a quick look at this patch.  The major problem with it is of
> course that it needs to be fixed for the recent extension-related
> changes.  I transposed the .sql.in changes into additions to
> btree_gist--1.0.sql (attached), but haven't really sanity-checked
> them beyond checking that the regression tests pass.  The same mods
> would need to be made in btree_gist--unpackaged--1.0.sql.

Fixed

> 1. oid_dist() returns oid ... really?  Oid is unsigned.  I'd be inclined
> to argue though that distance between Oids is a meaningless concept, so
Hmm, oid is often used as unsigned int.

> you should remove this not just mess with the result type.  Anybody who
> actually wants to form a distance between Oids should have to cast them
> to an arithmetic type first.  Let the user figure out how wraparound
> cases should be handled.

Distance between unsigned 32-bit integers could not be more than 2^32.

>
> 2. Beyond that, none of the distance routines have given any thought to
> avoiding overflow.  For instance, dist_int2 had better return something
> wider than int2, and so on up.  It looks to me like the internal gist
Just like other operations:
# select 32000::smallint + 32000::smallint;
ERROR:  smallint out of range

> distance functions also suffer overflow risks, in that they tend to form
> the difference first (in the source datatype) and only afterwards cast
> to float8.
fixed

> 3. I was surprised that there wasn't a distance implementation for
> numeric.  I suppose that this might be difficult to do without risking
> overflow in conversion to float8, though.
Exactly

> 4. I didn't much care for changing the result type of gbt_num_consistent
> from bool to float8; that's just messy, and I don't see any compensating
> advantage.  I suggest you leave gbt_num_consistent and its callers
> alone, and add a separate gbt_num_distance routine that only handles the
> KNNDistance case.
Done

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/

Attachment

pgsql-hackers by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: SSI bug?
Next
From: Tom Lane
Date:
Subject: Is the attribute options cache actually worth anything?