Home > mailing lists

Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: WIP: Covering + unique indexes.
Date	March 24, 2018 22:39:20
Msg-id	CAPpHfdvrn6S_2ArObKtEErB3_Z9+rqKzzKQfJ6jryCoKoc=peA@mail.gmail.com Whole thread
In response to	Re: WIP: Covering + unique indexes. (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: WIP: Covering + unique indexes.
List	pgsql-hackers

Tree view

On Sat, Mar 24, 2018 at 5:21 AM, Peter Geoghegan <pg@bowt.ie> wrote:

On Thu, Mar 22, 2018 at 8:23 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
>> * There is minor formatting issue in this part of code. Some spaces need
>> to be replaced with tabs.
>> +IndexTuple
>> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
>> +{
>> + TupleDesc itupdesc =
>> CreateTupleDescCopyConstr(RelationGetDescr(idxrel));
>> + Datum values[INDEX_MAX_KEYS];
>> + bool isnull[INDEX_MAX_KEYS];
>> + IndexTuple newitup;

The last time I looked at this patch, in April 2017, I made the point
that we should add something like an "nattributes" argument to
index_truncate_tuple() [1], rather than always using
IndexRelationGetNumberOfKeyAttributes() within index_truncate_tuple().
I can see that that change hasn't been made since that time.

+1, putting "nattributes" to argument of index_truncate_tuple() would

make this function way more universal.

With that approach, we can avoid relying on catalog metadata to the
same degree, which was a specific concern that Tom had around that
time. It's easy to do something with t_tid's offset, which is unused
in internal page IndexTuples. We do very similar things in GIN, where
alternative use of an IndexTuple's t_tid supports all kinds of
enhancements, some of which were not originally anticipated. Alexander
surely knows more about this than I do, since he wrote that code.

Originally that code was written by Teodor, but I also put my hands there.

In GIN entry tree, item pointers are stored in a posting list which is located

after index tuple attributes. So, both t_tid block number and offset are

used for GIN needs.

Having this index_truncate_tuple() "nattributes" argument, and storing
the number of attributes directly improves quite a lot of things:

* It makes diagnosing issues in the field quite a bit easier. Tools
like pg_filedump can do the right thing (Tom mentioned pg_filedump and
amcheck specifically). The nbtree IndexTuple format should not need to
be interpreted in a context-sensitive way, if we can avoid it.

* It lets you use index_truncate_tuple() for regular suffix truncation
in the future. These INCLUDE IndexTuples are really just a special
case of suffix truncation. At least, they should be, because otherwise
an eventual suffix truncation feature is going to be incompatible with
the INCLUDE tuple format. *Not* doing this makes suffix truncation
harder. Suffix truncation is a classic technique, first described by
Bayer in 1977, and we are very probably going to add it someday.

* Once you can tell a truncated IndexTuple from a non-truncated one
with little or no context, you can add defensive assertions in various
places where they're helpful. Similarly, amcheck can use and expect
this as a cross-check against IndexRelationGetNumberOfKeyAttributes().
This will increase confidence in the design, both initially and over
time.

That makes sense. Let's store the number of tuple attributes to t_tid.

Assuming that our INDEX_MAX_KEYS is quite small number, we will have

higher bits of t_tid free for latter use.

I must say that I am disappointed that nothing has happened here,
especially because this really wasn't very much additional work, and
has essentially no downside. I can see that it doesn't work that way
in the Postgres Pro fork [2], and diverging from that may
inconvenience Postgres Pro, but that's a downside of forking. I don't
think that the community should have to absorb that cost.

Sure, community shouldn't take care about Postgres Pro fork. If we find

that something is better to be done differently, than let us do it so.

> +Notes to Operator Class Implementors
> +------------------------------------
>
> Besides I really appreciate it, it seems to be unrelated to the covering
> indexes.
> I'd like this to be extracted into separate patch and be committed
> separately.

Commit 3785f7ee, from last month, moved the original "Notes to
Operator Class Implementors" section to the SGML docs. It looks like
that README section was accidentally reintroduced during rebasing. The
new information ("Included attributes in B-tree indexes") should be
moved over to the new section of the user docs -- the section that
3785f7ee added.

Thank you for noticing that. I've overlooked that.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

From: Tom Lane
Date: 24 March 2018, 22:17:54
Subject: Re: [HACKERS] [PATCH] Vacuum: Update FSM more frequently

From: Andres Freund
Date: 24 March 2018, 23:33:21
Subject: Re: Why does load_external_function() return PGFunction?

Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers

Previous

Next