Re: Incorrect behaviour when using a GiST index on points - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Incorrect behaviour when using a GiST index on points
Date
Msg-id 20121102124641.GA14496@tornado.leadboat.com
In response to Re: Incorrect behaviour when using a GiST index on points  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Incorrect behaviour when using a GiST index on points  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Fri, Nov 02, 2012 at 04:05:30PM +0400, Alexander Korotkov wrote:
> On Thu, Oct 18, 2012 at 11:18 PM, Noah Misch <noah@leadboat.com> wrote:

> > > --- 1339,1356 ----
> > >                       *recheck = false;
> > >                       break;
> > >               case BoxStrategyNumberGroup:
> > > !                     /*
> > > !                      * This code repeats logic of on_ob which uses simple comparison
> > > !                      * rather than FP* functions.
> > > !                      */
> > > !                     query = PG_GETARG_BOX_P(1);
> > > !                     key = DatumGetBoxP(entry->key);
> > > !
> > > !                     *recheck = false;
> > > !                     result = key->high.x >= query->low.x &&
> > > !                                      key->low.x <= query->high.x &&
> > > !                                      key->high.y >= query->low.y &&
> > > !                                      key->low.y <= query->high.y;
> >
> > For leaf entries, this correctly degenerates to on_pb().  For internal
> > entries, it must, but does not, implement box_overlap().  (The fuzzy
> > box_overlap() would be fine.)  I recommend making gist_point_consistent()'s
> > treatment of boxes resemble its treatment of circles and polygons; that
> > eases verifying their correctness.  Call gist_box_consistent.  Then, for
> > leaf entries, call box_contain_pt().
> >
> 
> I have two objections on doing that:
> 1) It's not evident for me that fuzzy comparison in internal pages is fine.
> Obviously, it depends on data distribution. It's easy to provide an example
> when fuzzy comparison will lead to significant performance degradation.
> 2) With PolygonStrategyNumberGroup CircleStrategyNumberGroup it's faster to
> do simple box comparison than doing calculation for exact circle and
> especially polygon check. In this case previous filtering in leaf pages
> looks reasonable. With BoxStrategyNumberGroup exact calculation is simpler
> than gist_box_consistent.

That's fair; I withdraw the recommendation to use gist_box_consistent().  It
remains that the code here must somehow implement a box_overlap()-style
calculation for internal pages.
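
To illustrate the leaf/internal distinction, here is a minimal sketch.  It is
not code from the patch or from the PostgreSQL tree: the Point and Box types
and all function names are simplified, hypothetical stand-ins, and the
comparisons are exact rather than fuzzy.  An internal key is the bounding box
of everything below it, so the test there is box-versus-box overlap; a leaf
key is a degenerate box (high == low == the point), for which the same
overlap test reduces to point-in-box containment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified stand-ins for PostgreSQL's Point and BOX. */
typedef struct { double x, y; } Point;
typedef struct { Point high, low; } Box;

/* Exact (non-fuzzy) overlap: true when the boxes share at least one point. */
static bool
box_overlap_exact(const Box *a, const Box *b)
{
    return a->high.x >= b->low.x && a->low.x <= b->high.x &&
           a->high.y >= b->low.y && a->low.y <= b->high.y;
}

/* Exact containment of a point in a box, borders included. */
static bool
box_contains_point_exact(const Box *b, const Point *p)
{
    return p->x >= b->low.x && p->x <= b->high.x &&
           p->y >= b->low.y && p->y <= b->high.y;
}

/* A leaf key for a point is the degenerate box with high == low == point. */
static Box
leaf_key(Point p)
{
    Box b = { p, p };
    return b;
}

/* Checks that overlap on a degenerate leaf key agrees with containment. */
static void
self_check(void)
{
    Box   query    = { {2.0, 2.0}, {0.0, 0.0} };   /* { high, low } */
    Box   internal = { {5.0, 5.0}, {1.0, 1.0} };   /* overlaps the query */
    Box   far_away = { {9.0, 9.0}, {8.0, 8.0} };   /* disjoint from it */
    Point inside   = { 1.0, 1.0 };
    Point outside  = { 3.0, 1.0 };
    Box   k_in     = leaf_key(inside);
    Box   k_out    = leaf_key(outside);

    /* Internal pages: descend only into subtrees whose box overlaps. */
    assert(box_overlap_exact(&internal, &query));
    assert(!box_overlap_exact(&far_away, &query));

    /* Leaf pages: overlap on the degenerate key equals containment. */
    assert(box_overlap_exact(&k_in, &query) ==
           box_contains_point_exact(&query, &inside));
    assert(box_overlap_exact(&k_out, &query) ==
           box_contains_point_exact(&query, &outside));
}
```

Calling self_check() exercises both cases; whichever comparison style is
chosen (exact or fuzzy), the internal-page test must never reject a subtree
that the leaf-level test would accept.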


