On Tue, 2021-03-30 at 17:57 +0800, Julien Rouhaud wrote:
> While running some sanity checks on the regression tests, I found one test that
> returns different results depending on whether an index or a sequential scan is
> used.
>
> Minimal reproducer:
>
> =# CREATE TABLE point_tbl AS select '(nan,nan)'::point f1;
> =# CREATE INDEX ON point_tbl USING gist(f1);
>
> =# EXPLAIN SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
> QUERY PLAN
> ------------------------------------------------------------------------------
> Seq Scan on point_tbl (cost=0.00..1.01 rows=1 width=16)
> Filter: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
> (2 rows)
>
> =# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
> f1
> -----------
> (NaN,NaN)
> (1 row)
>
> SET enable_seqscan = 0;
>
>
> =# EXPLAIN SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
> QUERY PLAN
> ----------------------------------------------------------------------------------------
> Index Only Scan using point_tbl_f1_idx on point_tbl (cost=0.12..8.14 rows=1 width=16)
> Index Cond: (f1 <@ '((0,0),(0,100),(100,100),(50,50),(100,0),(0,0))'::polygon)
> (2 rows)
>
> =# SELECT * FROM point_tbl WHERE f1 <@ polygon '(0,0),(0,100),(100,100),(50,50),(100,0),(0,0)';
> f1
> ----
> (0 rows)
>
> The discrepancy comes from the fact that the sequential scan checks the
> condition using point_inside() / lseg_crossing(), while the gist index will
> check the condition using box_overlap() / box_ov(), which have different
> opinions on how to handle NaN.
>
> Getting a consistent behavior shouldn't be hard, but I'm unsure which behavior
> is actually correct.
I'd say that this is certainly wrong:
SELECT point('NaN','NaN') <@ polygon('(0,0),(1,0),(1,1),(0,0)');
?column?
----------
t
(1 row)
Yours,
Laurenz Albe