Thread: Box type equality

Box type equality

From
Stanislav Kelvich
Date:
Hi.

I've faced an issue with Box type comparison that exists almost for a five years.

> create table t (b box);
CREATE TABLE
> select count(*), b from t group by b;
ERROR:  could not identify an equality operator for type box

As mentioned in http://www.postgresql.org/message-id/Pine.LNX.4.64.1012040051500.12632@sn.sai.msu.ru
 That can be fixed by b-tree equality for boxes, but we need some decisions there. We can compare
floats up to certain threshold or strictly, and box equality can be defined as coordinate equality or as equality of
areas. In a case of threshold-based comparison we will not lose equality due to rounding errors, for example  
applying commutative functions in different order will preserve equality, but we can face non-transitive equalities,
like 
box1 == box2, box2 == box3, but box1 != box3 and GROUP BY can produce strange results. With strict comparison equality
istransitive, but we can lose that equality due to rounding. 

Now in geo_decls.h we have:

#define EPSILON                    1.0E-06
#define FPeq(A,B)                (fabs((A) - (B)) <= EPSILON)

and equality means equal areas.

But for GROUP BY it better be opposite: equality of coordinates and strict comparison.

Simplest workaround in my perspective is to add another operator for box type (e.g. ==) that will perform strict
comparison
of coordinates and use it in b-tree ops.

Any objections/suggestions?

-----------------
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: Box type equality

From
Tom Lane
Date:
Stanislav Kelvich <s.kelvich@postgrespro.ru> writes:
> I've faced an issue with Box type comparison that exists almost for a five years.

Try twenty-five years.  The code's been like that since Berkeley.

>   That can be fixed by b-tree equality for boxes, but we need some
>   decisions there.

The problem with inventing a btree opclass for boxes is much more
fundamental than fuzzy comparisons, unfortunately.  Btree requires a
linear sort order, and there's no plausible linear ordering of boxes,
unless you compare areas which won't give the equality semantics you want.

We could perhaps invent an exact-equality operator and construct just
a hash opclass for it, no btree.

In any case I think it would be a mistake to consider only boxes; all
the built-in geometric types have related issues.
        regards, tom lane



Re: Box type equality

From
Jeff Anton
Date:
The Box type is the oldest non-linear type in the Postgres system.
We used it as the template for extensibility in the original system 
about thirty years ago.  We had R-trees for box indexing.  If you want 
fuzzy box matching, that seems possible with R-trees and some creativity 
by say matching an R-tree with a little larger box and using containment 
and maybe also not contained by a smaller box.  This is the idea behind 
strategies.  That you can use existing operations to build a new operation.

If you have to force this onto B-tree's I think you will have to choose 
one edge to index on (i.e. one of the four values) then fuzzy match that 
through the index and have a secondary condition to further restrict the 
matches.

As with all the geometric types, you can use containment boxes for them 
and have the secondary condition checks.

It's all just a few lines of code as Stonebraker used to say.

Jeff Anton


On 09/29/15 08:43, Tom Lane wrote:
> Stanislav Kelvich <s.kelvich@postgrespro.ru> writes:
>> I've faced an issue with Box type comparison that exists almost for a five years.
>
> Try twenty-five years.  The code's been like that since Berkeley.
>
>>    That can be fixed by b-tree equality for boxes, but we need some
>>    decisions there.
>
> The problem with inventing a btree opclass for boxes is much more
> fundamental than fuzzy comparisons, unfortunately.  Btree requires a
> linear sort order, and there's no plausible linear ordering of boxes,
> unless you compare areas which won't give the equality semantics you want.
>
> We could perhaps invent an exact-equality operator and construct just
> a hash opclass for it, no btree.
>
> In any case I think it would be a mistake to consider only boxes; all
> the built-in geometric types have related issues.
>
>             regards, tom lane
>
>