Home > mailing lists

Re: Indirect indexes - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: Indirect indexes
Date	November 1, 2016 04:44:40
Msg-id	20161101044331.4w73ywvi42niltva@alvherre.pgsql Whole thread Raw
In response to	Indirect indexes (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses	Re: Indirect indexes
List	pgsql-hackers

Tree view

Alvaro Herrera wrote:
> I propose we introduce the concept of "indirect indexes".

This is a WIP non-functional patch for indirect indexes.  I've been
distracted from working on it for some time already and will be off-line
for half this month yet, but since this was discussed and seems to be
considered a welcome idea, I am posting it for those who want to have a
look at what I'm doing.  This can write values to indirect indexes (only
btrees), but it cannot read values from them yet, so don't expect this
to work at all unless you are hoping to read index files using
pageinspect.  (If you do test this, be aware that "VACUUM FULL pg_class"
is a case that I know needs fixed.)

I think the most interesting change here is how
HeapSatisfiesHOTandKeyUpdate() has accomodated some additional code to
return a bitmapset of columns that are not modified by an update.

This implements a new command
  CREATE INDIRECT INDEX
which instructs the AM to create an index that stores primary key values
instead of CTID values.  I have not tried yet to remove the limitation
of only six bytes in the PK value.  The part of the patch I'm not
submitting just yet adds a flag to IndexInfo used by IndexPath, so that
when the index is selected by the planner, an IndirectIndexScan node is
created instead of a plain IndexScan.  This node knows how to invoke the
AM so that the PK values are extracted in a first step and the CTIDs are
extracted from the PK in a second step (IndirectIndexScan carries two
IndexScanDesc structs and two index RelationDescs, so it keeps both the
indirect index and the PK index open).

The part that generated the most discussion was vacuuming.  As I said
earlier, I think that instead of trying to shoehorn an index cleanup in
regular vacuum (and cause a terrible degradation of maintenance_work_mem
consumption, into optimizing which so much work has gone), these indexes
would rely on desultory cleanup instead through the "killtuple"
interface that causes index tuples to be removed during scan.  Timely
cleanup is not critical as it is with regular (direct) indexes, given
that CTIDs are not stored and thus tuple movement does not affect this
type of indexes.

This patch is considerably smaller than the toy patch I had, which
introduced a separate AM for "ibtree" -- that was duplicating a lot
of code and adding more code complexity, which becomes much simpler with
the approach in the current code; one thing I didn't like at all was the
fact that the ibtree routines were calling the "internal" btree
routines, which was not nice.  (Also, it would have meant having
duplicate operator family/class rows.)

I hope to be back at home to collaborate with the commitfest on Nov
14th.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

indirect.patch

pgsql-hackers by date:

From: "Tsunakawa, Takayuki"
Date: 01 November 2016, 04:35:39
Subject: Re: ECPG BUlk insert support using arrays

From: Peter Eisentraut
Date: 01 November 2016, 04:55:25
Subject: DROP FUNCTION of multiple functions

Re: Indirect indexes - Mailing list pgsql-hackers

Attachment

Previous

Next