Thread: SELECT DISTINCT on boxes

SELECT DISTINCT on boxes

From
Bruce Momjian
Date:
Is there a way to do a SELECT DISTINCT on boxes:

        test=> create TABLE t3 (a box);
        CREATE TABLE
        test=> insert into t3 values ('(2,2),(1,1)');
        INSERT 17232 1
        test=> insert into t3 values ('(2,2),(1,1)');
        INSERT 17233 1
        test=> insert into t3 values ('(3,3),(2,2)');
        INSERT 17234 1
        test=> insert into t3 values ('(3,3),(2,2)');
        INSERT 17235 1
        test=> select distinct * from t3;
        ERROR:  could not identify an ordering operator for type box
        HINT:  Use an explicit ordering operator or modify the query.

I tried doing subqueries and using oids but that didn't help.

I don't understand why this doesn't work:

    SELECT a
    FROM t3 t2
    WHERE t2.oid = (SELECT MIN(t.oid) FROM t3 t WHERE t2.a = t.a);

          a
    -------------
     (2,2),(1,1)
    (1 row)

If finds only the duplicate.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: SELECT DISTINCT on boxes

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Is there a way to do a SELECT DISTINCT on boxes:
>
>         test=> create TABLE t3 (a box);
>         CREATE TABLE
>         test=> insert into t3 values ('(2,2),(1,1)');
>         INSERT 17232 1
>         test=> insert into t3 values ('(2,2),(1,1)');
>         INSERT 17233 1
>         test=> insert into t3 values ('(3,3),(2,2)');
>         INSERT 17234 1
>         test=> insert into t3 values ('(3,3),(2,2)');
>         INSERT 17235 1
>         test=> select distinct * from t3;
>         ERROR:  could not identify an ordering operator for type box
>         HINT:  Use an explicit ordering operator or modify the query.
>
> I tried doing subqueries and using oids but that didn't help.
>
> I don't understand why this doesn't work:
>
>     SELECT a
>     FROM t3 t2
>     WHERE t2.oid = (SELECT MIN(t.oid) FROM t3 t WHERE t2.a = t.a);
>
>           a
>     -------------
>      (2,2),(1,1)
>     (1 row)
>
> If finds only the duplicate.

I found the cause.  Equals for boxes compares only area, \do:

    pg_catalog | =    | box    | box     | boolean   | equal by area

The proper fix is to use ~= which compares boxes for similarity:

    pg_catalog | ~=   | box   | box   | boolean            | same as?

The correct query for DISTINCT is:

    test=> SELECT oid, a FROM t3 t2 WHERE t2.oid = (SELECT MIN(t.oid) FROM
    t3 t WHERE t2.a ~= t.a);
      oid  |      a
    -------+-------------
     17232 | (2,2),(1,1)
     17234 | (3,3),(2,2)
    (2 rows)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: SELECT DISTINCT on boxes

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I found the cause.  Equals for boxes compares only area, \do:

... which is in itself pretty bogus IMHO.  There are a couple of the
geometric types for which '=' does not do what you'd really expect.
I've been wondering if we could get away with changing this ...

            regards, tom lane

Re: SELECT DISTINCT on boxes

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I found the cause.  Equals for boxes compares only area, \do:
>
> ... which is in itself pretty bogus IMHO.  There are a couple of the
> geometric types for which '=' does not do what you'd really expect.
> I've been wondering if we could get away with changing this ...

Yes, we would have to document it in the release notes but it is quite
surprising at is currently is configured.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: SELECT DISTINCT on boxes

From
Bruce Momjian
Date:
Added to TODO:

    * Fix data types where equality comparison isn't intuitive, e.g. box


---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I found the cause.  Equals for boxes compares only area, \do:
>
> ... which is in itself pretty bogus IMHO.  There are a couple of the
> geometric types for which '=' does not do what you'd really expect.
> I've been wondering if we could get away with changing this ...
>
>             regards, tom lane
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073