Re: Issue with point_ops and NaN - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Issue with point_ops and NaN
Date
Msg-id 449937a274a6362a77553b8786c45d787f8c0aa2.camel@cybertec.at
Whole thread Raw
In response to Issue with point_ops and NaN  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: Issue with point_ops and NaN  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Tue, 2021-03-30 at 17:57 +0800, Julien Rouhaud wrote:
> While running some sanity checks on the regression tests, I found one test that
> returns different results depending on whether an index or a sequential scan is
> used.
> 
> Minimal reproducer:
> 
> =# CREATE TABLE point_tbl AS select '(nan,nan)'::point f1;
> =# CREATE INDEX ON point_tbl USING gist(f1);
> 
> =# EXPLAIN SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
>                                   QUERY PLAN
> ------------------------------------------------------------------------------
>  Seq Scan on point_tbl  (cost=0.00..1.01 rows=1 width=16)
>    Filter: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
> (2 rows)
> 
> =# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
>     f1
> -----------
>  (NaN,NaN)
> (1 row)
> 
> SET enable_seqscan = 0;
> 
> 
> =# EXPLAIN SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
>                                        QUERY PLAN                                       
> ----------------------------------------------------------------------------------------
>  Index Only Scan using point_tbl_f1_idx on point_tbl  (cost=0.12..8.14 rows=1 width=16)
>    Index Cond: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
> (2 rows)
> 
> =# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
>  f1 
> ----
> (0 rows)
> 
> The discrepancy comes from the fact that the sequential scan checks the
> condition using point_inside() / lseg_crossing(), while the gist index will
> check the condition using box_overlap() / box_ov(), which have different
> opinions on how to handle NaN.
> 
> Getting a consistent behavior shouldn't be hard, but I'm unsure which behavior
> is actually correct.

I'd say that this is certainly wrong:

SELECT point('NaN','NaN') <@ polygon('(0,0),(1,0),(1,1),(0,0)');

 ?column? 
----------
 t
(1 row)

Yours,
Laurenz Albe




pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Outdated comment for CreateStmt.inhRelations
Next
From: Julien Rouhaud
Date:
Subject: Re: Issue with point_ops and NaN