Re: Minmax indexes - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Minmax indexes |
Date | |
Msg-id | 20140617160428.GE6836@awork2.anarazel.de |
In response to | Re: Minmax indexes (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Minmax indexes
Re: Minmax indexes |
List | pgsql-hackers |
On 2014-06-17 11:48:10 -0400, Robert Haas wrote:
> On Tue, Jun 17, 2014 at 10:31 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-06-17 10:26:11 -0400, Robert Haas wrote:
> >> On Sat, Jun 14, 2014 at 10:34 PM, Alvaro Herrera
> >> <alvherre@2ndquadrant.com> wrote:
> >> > Robert Haas wrote:
> >> >> On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
> >> >> <alvherre@2ndquadrant.com> wrote:
> >> >> > Here's an updated version of this patch, with fixes to all the bugs
> >> >> > reported so far. Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and
> >> >> > Amit Kapila for the reports.
> >> >>
> >> >> I'm not very happy with the use of a separate relation fork for
> >> >> storing this data.
> >> >
> >> > Here's a new version of this patch. Now the revmap is not stored in a
> >> > separate fork, but together with all the regular data, as explained
> >> > elsewhere in the thread.
> >>
> >> Cool.
> >>
> >> Have you thought more about this comment from Heikki?
> >>
> >> http://www.postgresql.org/message-id/52495DD3.9010809@vmware.com
> >
> > Is there actually a significant usecase behind that wish or just a
> > general demand for being generic? To me it seems fairly unlikely you'd
> > end up with something useful by doing a minmax index over bounding
> > boxes.
>
> Well, I'm not the guy who does things with geometric data, but I don't
> want to ignore the significant percentage of our users who are. As
> you must surely know, the GIST implementations for geometric data
> types store bounding boxes on internal pages, and that seems to be
> useful to people. What is your reason for thinking that it would be
> any less useful in this context?

For me minmax indexes are helpful because they allow generating *small*
'coarse' indexes over large volumes of data. From my pov that's possible
because they don't contain item pointers for every contained row.
That'll imo work well if there are consecutive rows in the table that
can be summarized into one min/max range. That's quite likely to happen
for common applications of a number of scalar datatypes. But placing
sufficiently many rows with very similar bounding boxes close together
seems much less likely in practice. And I think that's generally the
case for operations which can't be well represented as btree opclasses -
the substructure that implies inside a Datum will make correlation
between consecutive rows less likely.

Maybe I've a major intuition failure here though...

> I do also think that a general demand for being generic ought to carry
> some weight.

Agreed. It's always a balancing act. But it's not like this doesn't use
a datatype abstraction concept...

> We have gone to great lengths to make sure that our
> indexing can handle more than just < and >, where a lot of other
> products have not bothered. I think we have gotten a lot of mileage
> out of that decision and feel that we shouldn't casually back away
> from it.

I don't see this as a case of backing away from that though?

> we shouldn't accept a less-generic
> approach blindly, without questioning whether it's possible to do
> better.

But the aim shouldn't be to add genericity that's not going to be used,
but to add it where it's somewhat likely to help...

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
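
The correlation argument in the message above can be illustrated with a small sketch. This is not code from the patch, only a toy model under stated assumptions: the helpers summarize() and ranges_to_scan() are hypothetical names, the "table" is a plain list of integers, and the block size of 100 rows is arbitrary. It keeps one (min, max) pair per block of consecutive rows and counts how many blocks a range query would still have to visit.

```python
import random


def summarize(values, rows_per_range):
    """Return one (min, max) pair per block of consecutive rows."""
    chunks = (values[i:i + rows_per_range]
              for i in range(0, len(values), rows_per_range))
    return [(min(chunk), max(chunk)) for chunk in chunks]


def ranges_to_scan(summary, lo, hi):
    """Count blocks whose [min, max] interval overlaps the query [lo, hi]."""
    return sum(1 for mn, mx in summary if mx >= lo and mn <= hi)


rng = random.Random(0)

# Physically ordered column, e.g. an append-only serial or timestamp.
correlated = list(range(10_000))

# Same values, but with no correlation between position and value.
shuffled = correlated[:]
rng.shuffle(shuffled)

corr_summary = summarize(correlated, 100)
shuf_summary = summarize(shuffled, 100)

corr_hits = ranges_to_scan(corr_summary, 5000, 5099)
shuf_hits = ranges_to_scan(shuf_summary, 5000, 5099)

print(f"ordered:  {corr_hits} of {len(corr_summary)} ranges to scan")
print(f"shuffled: {shuf_hits} of {len(shuf_summary)} ranges to scan")
```

On the ordered column the query overlaps a single range, so the summary prunes almost everything. On the shuffled column nearly every block spans most of the value domain, so the summary prunes close to nothing. The same effect is what the message expects for datatypes whose consecutive rows are rarely similar.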