Re: Strange behavior with polygon and NaN - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Strange behavior with polygon and NaN
Date
Msg-id 20200825230350.GA20669@momjian.us
Whole thread Raw
In response to Strange behavior with polygon and NaN  (Jesse Zhang <sbjesse@gmail.com>)
Responses Re: Strange behavior with polygon and NaN
List pgsql-hackers
I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

---------------------------------------------------------------------------

On Wed, Jun 24, 2020 at 03:11:03PM -0700, Jesse Zhang wrote:
> Hi hackers,
> 
> While working with Chris Hajas on merging Postgres 12 with Greenplum
> Database we stumbled upon the following strange behavior in the geometry
> type polygon:
> 
> ------ >8 --------
> 
> CREATE TEMP TABLE foo (p point);
> CREATE INDEX ON foo USING gist(p);
> 
> INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');
> 
> SELECT $q$
> SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
> $q$ AS qry \gset
> 
> BEGIN;
> SAVEPOINT yolo;
> SET LOCAL enable_seqscan TO off;
> :qry;
> 
> ROLLBACK TO SAVEPOINT yolo;
> SET LOCAL enable_indexscan TO off;
> SET LOCAL enable_bitmapscan TO off;
> :qry;
> 
> ------ 8< --------
> 
> If you run the above repro SQL in HEAD (and 12, and likely all older
> versions), you get the following output:
> 
> CREATE TABLE
> CREATE INDEX
> INSERT 0 3
> BEGIN
> SAVEPOINT
> SET
>    p
> -------
>  (0,0)
>  (1,1)
> (2 rows)
> 
> ROLLBACK
> SET
> SET
>      p
> -----------
>  (0,0)
>  (1,1)
>  (NaN,NaN)
> (3 rows)
> 
> 
> At first glance, you'd think this is the gist AM's bad, but on a second
> thought, something else is strange here. The following query returns
> true:
> 
> SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
> 
> The above behavior of the "contained in" operator is surprising, and
> it's probably not what the GiST AM is expecting. I took a look at
> point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
> NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
> distnace operator for polygon. It gives the following interesting
> output:
> 
> SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
> FROM (
>   SELECT circle(point(100 * i, 'NaN'), 50) AS c
>   FROM generate_series(-2, 4) i
> ) t(c)
> ORDER BY 2;
> 
>         c        | distance
> -----------------+----------
>  <(-200,NaN),50> |        0
>  <(-100,NaN),50> |        0
>  <(0,NaN),50>    |        0
>  <(100,NaN),50>  |        0
>  <(200,NaN),50>  |      NaN
>  <(300,NaN),50>  |      NaN
>  <(400,NaN),50>  |      NaN
> (7 rows)
> 
> Should they all be NaN? Am I alone in thinking the index is right but
> the operators are wrong? Or should we call the indexes wrong here?
> 
> Cheers,
> Jesse and Chris
> 
> 

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EnterpriseDB                             https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee




pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: [PATCH] Fix Uninitialized scalar variable (UNINIT) (src/backend/access/heap/heapam_handler.c)
Next
From: Ranier Vilela
Date:
Subject: Re: [PATCH] Fix Uninitialized scalar variable (UNINIT) (src/backend/access/heap/heapam_handler.c)