Minmax indexes - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Minmax indexes
Date
Msg-id 20130915001442.GE5033@eldon.alvh.no-ip.org
Whole thread Raw
Responses Re: Minmax indexes
Re: Minmax indexes
Re: Minmax indexes
Re: Minmax indexes
Re: Minmax indexes
List pgsql-hackers
Hi,

Here's a reviewable version of what I've dubbed Minmax indexes.  Some
people said they would like to use some other name for this feature, but
I have yet to hear usable ideas, so for now I will keep calling them
this way.  I'm open to proposals, but if you pick something that cannot
be abbreviated "mm" I might have you prepare a rebased version which
renames the files and structs.

The implementation here has been simplified from what I originally
proposed at 20130614222805.GZ5491@eldon.alvh.no-ip.org -- in particular,
I noticed that there's no need to involve aggregate functions at all; we
can just use inequality operators.  So the pg_amproc entries are gone;
only the pg_amop entries are necessary.

I've somewhat punted on the question of doing resummarization separately
from vacuuming.  Right now, resummarization (as well as other necessary
index cleanup) takes place in amvacuumcleanup.  This is not optimal; I
have stated elsewhere that I'd like to create separate maintenance
actions that can be carried out by autovacuum.  That would be useful
both for Minmax indexes and GIN indexes (pending insertion list); maybe
others.  That's not part of this patch, however.

The design of this stuff is in the file "minmax-proposal" at the top of
the tree.  That file is up to date, though it still contains some open
questions that were present in the original proposal.  (I have not fixed
some bogosities pointed out by Noah, for instance.  I will do that
shortly.)  In a final version, that file would be applied as
src/backend/access/minmax/README, most likely.

One area on which I needed to modify core code is IndexBuildHeapScan.  I
needed a version that was able to scan only a certain range of pages,
not the entire table, so I introduced a new IndexBuildHeapRangeScan, and
added a quick "heap_scansetlimits" function.  I haven't tested that this
works outside of the HeapRangeScan thingy, so it's probably completely
bogus; I'm open to suggestions if people think this should be
implemented differently.  In any case, keeping that implementation
together with vanilla IndexBuildHeapScan makes a lot of sense.

One thing still to tackle is when to mark ranges as unsummarized.  Right
now, any new tuple on a page range would cause a new index entry to be
created and a new revmap update.  This would cause huge index bloat if,
say, a page is emptied and vacuumed and filled with new tuples with
increasing values outside the original range; each new tuple would
create a new index tuple.  I have two ideas about this (1. mark range as
unsummarized if 3rd time we touch the same page range; 2. vacuum the
affected index page if it's full, so we can maintain the index always up
to date without causing unduly bloat), but I haven't implemented
anything yet.

The "amcostestimate" routine is completely bogus; right now it returns
constant 0, meaning the index is always chosen if it exists.

There are opclasses for int4, numeric and text.  The latter doesn't work
at all, because collation info is not passed down at all.  I will have
to figure that out (even if I find unlikely that minmax indexes have any
usefulness on top of text columns).  I admit that numeric hasn't been
tested, and it's quite likely that they won't work; mainly because of
lack of some datumCopy() calls, about which the code contains some
/* XXX */ lines.  I think this should be relatively straightforward.
Ideally, the final version of this patch would contain opclasses for all
supported datatypes (i.e. the same that have got btree opclasses).

I have messed up the opclass information, as evidenced by failures in
opr_sanity regression test.  I will research that later.


There's working contrib/pageinspect support; pg_xlogdump (and wal_debug)
seems to work sanely too.
This patch compiles cleanly under -Werror.

  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Next
From: Bernd Helmle
Date:
Subject: Re: GUC for data checksums