Re: Indirect indexes - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Indirect indexes
Date
Msg-id 20161021230410.kyngu5dzssvi44wk@alvherre.pgsql
Whole thread Raw
In response to Re: Indirect indexes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Indirect indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Indirect indexes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas wrote:

> So, I think that this is a really promising direction, but also that
> you should try very hard to try to get out from under this 6-byte PK
> limitation.  That seems really ugly, and in practice it probably means
> your PK is probably going to be limited to int4, which is kind of sad
> since it leaves people using int8 or text PKs out in the cold.

I think we could just add a new type, unsigned 6 byte int, specifically
for this purpose.  Little in the way of operators, as it's pointless to
try to do arithmetic with object identifiers.  (It'd be similar to UUID
in spirit, but I wouldn't try to do anything too smart to generate them.)

> I believe Claudio Freire is on to something when he suggests storing
> the PK in the index tuple; one could try to skip storing the TID, or
> always store it as all-zeroes.

That bloats the index a bit.  But then, maybe that is fine ...

> The VACUUM problems seem fairly serious.  It's true that these indexes
> will be less subject to bloat, because they only need updating when
> the PK or the indexed columns change, not when other indexed columns
> change.  On the other hand, there's nothing to prevent a PK from being
> recycled for an unrelated tuple.  We can guarantee that a TID won't be
> recycled until all index references to the TID are gone, but there's
> no such guarantee for a PK.  AFAICT, that would mean that an indirect
> index would have to be viewed as unreliable: after looking up the PK,
> you'd *always* have to recheck that it actually matched the index
> qual.

Yes, recheck is always needed.

As for vacuum, I was thinking this morning that perhaps the answer to
that is just to not vacuum the index at all and instead rely on the
killtuple interface (which removes tuples during scan).  So we don't
need to spend precious maint_work_mem space on a large list of PK
values.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fun fact about autovacuum and orphan temp tables
Next
From: Tom Lane
Date:
Subject: Re: Indirect indexes