Re: Bug in GiST paring heap comparator - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Bug in GiST paring heap comparator
Date
Msg-id CAPpHfdtaQZLC_9Zyn9rFMoxVKPimDBce0SwKvT5+7XH=z_xYvA@mail.gmail.com
Whole thread Raw
In response to Re: Bug in GiST paring heap comparator  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: Bug in GiST paring heap comparator
List pgsql-hackers
On Mon, Sep 2, 2019 at 9:28 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> On 02/09/2019 07:54, Alexander Korotkov wrote:
> > NULL and '(NaN,NaN)' are swapped.  It happens so, because we assume
> > distance to NULL to be Inf, while float8_cmp_internal() assumes NaN to
> > be greater than NULL.  If even we would assume distance to NULL to be
> > Nan, it doesn't guarantee that NULLs would be last.  It looks like we
> > can handle this only by introduction array of "distance is null" flags
> > to GISTSearchItem.  But does it worth it?
>
> I don't think we have much choice, returning wrong results is not an
> option. At first I thought we could set the "recheck" flag if we see
> NULLs or NaNs, but that won't work because rechecking is not supported
> with Index-Only Scans.
>
> Looking at the corresponding SP-GiST code,
> pairingheap_SpGistSearchItem_cmp() gets this right. I'd suggest
> copy-pasting the implementation from there, so that they would be as
> similar as possible. (SP-GiST gets away with just one isNull flag,
> because it doesn't support multi-column indexes. In GiST, you need an
> array, as you said.)

Thank you for clarifying my doubts.

Second of attached patches solves this problem.  In this patch
GISTSearchItem have both array of distances and array of null flags at
the end.  They are accessed by a bit cumbersome
GISTSearchItemDistanceValues() and GISTSearchItemDistanceNulls()
macros.  I came up to this, because I didn't like to allocate null
flags or distances in separate chunk of memory.  Also I had idea to
wrap distance and null flag into struct, but I didn't like to change
signature of index_store_float8_orderby_distances() that much.

I'm going to push both if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [bug fix] Produce a crash dump before main() on Windows
Next
From: Alvaro Herrera from 2ndQuadrant
Date:
Subject: Re: [PATCH] Speedup truncates of relation forks