Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location
Date
Msg-id CAGTBQpYJDR7+qiFz6P7HavUvo6=zniOVEh-DGufQmDA+Wjwymg@mail.gmail.com
Whole thread Raw
In response to Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location  (Peter Geoghegan <pg@heroku.com>)
Responses Re: [WIP] [B-Tree] Keep indexes sorted by heap physical location  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Thu, Aug 18, 2016 at 6:38 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Aug 18, 2016 at 2:26 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
>> I see that. I could try to measure average depth to measure the impact
>> this had on fan-in.
>>
>> While it should cut it in half for narrow indexes, half of very high
>> is still high. Wide indexes, which are are the ones that would suffer
>> from poor fan-in, would feel this far less, since the overhead is
>> constant.
>
> You'll also have to consider the impact on the choice of split point,
> FWIW. There is currently leeway about what page an index tuple can
> land on, if and when it happens to be equal to the high-key. I'm not
> sure how important that is, but it's a consideration.

When I read the code, it seemed generic enough, using ItemIdGetLength
(which works on non-leaf index tuples correctly, reporting the right
size), so it should still work.

But the choice of split point may make a difference for future
insertions, so I'll look into that.


>> Even if it does have an impact, I don't see an alternative, without
>> also implementing suffix truncation. Perhaps I could try to avoid
>> adding the leaf tid header if it isn't necessary, though I would have
>> to use up the last available flag bit in t_info for that.
>
> To be clear, I'm particularly worried about painting ourselves into a
> corner, suffix-truncation-wise. It's a very common optimization, that
> we should really have.

Well, page version numbers could be used to switch between
semantically similar page formats later on. So if another format is
necessary (say, prefix compression, suffix truncation, etc), it can be
changed on-the-fly on existing indexes by checking page version
numbers and doing the proper switch.

So we won't be painting ourselves into a corner, but picking the wrong
format may indeed be a headache.

I can go the way of the flag in t_info if that's preferrable. Both
would work. It's just that it's the last flag available, and that
would be painting ourselves into a corner regarding flags. To avoid
that, the flag itself could be "has extra data" (instead of has leaf
tid), and add a whole set of flags in the extra data. That could also
help for preffix compression and suffix truncation.

>>> ISTM that the way to address this problem is with a duplicate list
>>> and/or prefix compression in leaf pages.
>>
>> Prefix compression is another one I will be looking into eventually,
>> but last time I tried it was far more invasive so I abandoned until I
>> could find a less invasive way to do it.
>
> Unfortunately, sometimes it is necessary to be very ambitious in order
> to solve a problem. The understandable and usually well-reasoned
> approach of making progress as incremental as possible occasionally
> works against contributors. It's worth considering if this is such a
> case.

I'd agree if this regressed performance considerably without the other
improvements, so I guess this will depend on the fan-in measurements.



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH] add option to pg_dumpall to exclude tables from the dump
Next
From: Alvaro Herrera
Date:
Subject: errno clobbering in reorderbuffer