Re: Minmax indexes - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Minmax indexes
Date
Msg-id CAM-w4HPPs86--tRGyX8gDxXQX0rd208+ztFmA5GWq=3G-2QjBw@mail.gmail.com
Whole thread Raw
In response to Re: Minmax indexes  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Minmax indexes  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Tue, Jun 17, 2014 at 11:16 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> Well, it might be doable to correlate them along one axis, but along
> both?  That's more complicated... And even alongside one axis you
> already get into problems if your geometries are irregularly sized.
> Asingle large polygon will completely destroy indexability for anything
> stored physically close by because suddently the minmax range will be
> huge... So you'll need to cleverly sort for that as well.

I think there's a misunderstanding here, possibly mine. My
understanding is that a min/max index will always be exactly the same
size for a given size table. It stores the minimum and maximum value
of the key for each page. Then you can do a bitmap scan by comparing
the search key with each page's minimum and maximum to see if that
page needs to be included in the scan. The failure mode is not that
the index is large but that a page that has an outlier will be
included in every scan as a false positive incurring an extra iop.

I don't think it's implausible at all that Geometric data would work
well. If you load Geometric data it's very common to load data by
geographic area so that all objects in San Francisco in one part of
the data load, probably even by zip code or census block.

What operations would an opclass for min/max need? I think it would be
pretty similar to the operators that GiST needs (thankfully minus the
baroque page split function):

An aggregate to generate a min/max "bounding box" from several values
A function which takes an "bounding box" and a new value and returns
the new "bounding box"
A function which tests if a value is in a "bounding box"
A function which tests if a "bounding box" overlaps a "bounding box"

The nice thing is this would let us add things like range @> (contains
element) to the plain integer min/max case.


-- 
greg



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: comparison operators
Next
From: Robert Haas
Date:
Subject: Re: btreecheck extension