From: Alvaro Herrera
Subject: Re: [RFC] Minmax indexes
Date: 2013-06-17 20:12:48
Msg-id: 20130617201248.GE3537@eldon.alvh.no-ip.org
In response to: Re: [RFC] Minmax indexes (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
Greg Stark wrote:
> On Fri, Jun 14, 2013 at 11:28 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Re-summarization is relatively expensive, because the complete page range has
> > to be scanned.
> 
> That doesn't sound too bad to me. It just means there's a downside to
> having larger page ranges. I would expect the page ranges to be
> something in the ballpark of 32 pages --  scanning 32 pages to
> resummarize doesn't sound that painful but sounds like it's large
> enough that the resulting index would be a reasonable size.

Actually, I'm thinking that a range is more like, say, 1280 pages, or 10
MB.  My goal is to support tables on the order of 10 TB; if I store one
index tuple for every 32 pages, I end up with too large an index.  With
10 MB per range I can index the whole table with an index of about
50 MB, which seems reasonable to scan.  In any case, my intention is
for the page range size to be configurable.
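
Rough numbers, assuming the standard 8 kB page size and index tuples of
roughly 50 bytes apiece (both assumptions, not measurements):

    10 TB table = ~1.3 billion 8 kB pages
    1280 pages/range (10 MB)  -> ~1 million ranges   -> ~50 MB index
      32 pages/range (256 kB) -> ~42 million ranges  -> ~2 GB index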

> But I don't understand why an insert would invalidate a tuple. An
> insert can just update the min and max incrementally. It's a delete
> that invalidates the range, but as you note it doesn't really
> invalidate it, just marks it as needing a refresh -- and even then
> only if the value being deleted is equal to either the min or max.

No, I don't intend to update the index tuple with each heap insert; I
think that would end up being too slow.  The validity map is there to
hold a single bit for each page range saying whether the range is known
valid or not; one insert into any page of the range invalidates the
whole range (and any scan of the table needs to scan that range as a
whole).  This way I can wait until the storm of inserts has passed over
a range, and only then do the summarization for that range.  This
avoids having to summarize the range over and over.
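
To illustrate the idea, here is a minimal sketch (made-up names and
layout, not actual patch code) of the per-range validity bits:

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t BlockNumber;

    typedef struct ValidityMap
    {
        BlockNumber pages_per_range;    /* fixed at index creation time */
        uint8_t    *bits;               /* one bit per page range */
    } ValidityMap;

    /* A heap insert anywhere in a range clears that range's bit. */
    static void
    vm_invalidate(ValidityMap *vm, BlockNumber heap_blk)
    {
        BlockNumber range = heap_blk / vm->pages_per_range;

        vm->bits[range / 8] &= (uint8_t) ~(1u << (range % 8));
    }

    /* A scan may trust the stored min/max only if the bit is set. */
    static bool
    vm_range_is_valid(const ValidityMap *vm, BlockNumber range)
    {
        return (vm->bits[range / 8] >> (range % 8)) & 1;
    }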

Alternatively, I could consider the index tuple always valid, and
update it online as soon as we do an insert or update (i.e., examine
the min/max values in the index tuple, and widen them if the new value
is out of bounds).  That seems too slow, so I won't pursue it.
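
Spelled out, that rejected alternative is just this, done on every
single heap insert (hypothetical types, integer keys for simplicity):

    #include <stdint.h>

    typedef struct MinMaxSummary
    {
        int32_t min;    /* smallest key seen in the page range */
        int32_t max;    /* largest key seen in the page range */
    } MinMaxSummary;

    /* Widening the summary in the insert path is the part that looks
       too expensive to do online. */
    static void
    minmax_absorb(MinMaxSummary *s, int32_t newval)
    {
        if (newval < s->min)
            s->min = newval;
        if (newval > s->max)
            s->max = newval;
    }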

Also, a delete does not invalidate a range either.  As Simon said
elsewhere in the thread, if the range is not minimal (i.e. the stored
min/max are wider than strictly necessary), this is not a correctness
problem; we might have to scan some more ranges than absolutely
necessary, but the heap scan recheck node will get rid of the unwanted
tuples anyway.
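
In scan terms, building on the sketch above (helper names are again
hypothetical):

    extern void scan_range_and_recheck(BlockNumber range, int32_t scankey);

    static void
    minmax_scan(const ValidityMap *vm, const MinMaxSummary *summaries,
                BlockNumber nranges, int32_t scankey)
    {
        for (BlockNumber range = 0; range < nranges; range++)
        {
            /* An invalid (stale) summary forces the range to be read,
               and a non-minimal one merely admits extra ranges; either
               way the per-tuple recheck discards false matches, so the
               cost is extra I/O, never wrong results. */
            if (!vm_range_is_valid(vm, range) ||
                (scankey >= summaries[range].min &&
                 scankey <= summaries[range].max))
                scan_range_and_recheck(range, scankey);
        }
    }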

> > Same-size page ranges?
> > Current related literature seems to consider that each "index entry" in a
> > minmax index must cover the same number of pages.  There doesn't seem to be a
> 
> I assume the reason for this in the literature is the need to quickly
> find the summary for a given page when you're handling an insert or
> delete.

Yeah, that's one thing to keep in mind.  I haven't looked into this
very deeply; I only added these two entries at the end for my own
future reference, because I will need to consider them at some point.

Right now my intention is for each index to have a fixed page range
size, defined at index creation time.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


