Re: Heap WARM Tuples - Design Draft - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: Heap WARM Tuples - Design Draft
Date
Msg-id CAGTBQpa1VoYHsBjZ97-uPLn+poTJWY=6O9sSPGzNtsb1xfB_9Q@mail.gmail.com
In response to Re: Heap WARM Tuples - Design Draft  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Mon, Aug 8, 2016 at 5:24 PM, Bruce Momjian <bruce@momjian.us> wrote:
> On Mon, Aug  8, 2016 at 03:36:12PM -0300, Claudio Freire wrote:
>> I think I prefer a more thorough approach.
>>
>> Increment/decrement may get very complicated with custom opclasses,
>> for instance. A column-change bitmap won't know how to handle
>> functional indexes, etc.
>
> How does HOT handle them?  If it does, it has to look at all the columns
> passed to the expression index, so it seems to be the same amount of
> work.

IIRC, the way HOT handles it produces false positives (as in: HOT is
forbidden when it would have been ok), because it may flag as changed
expressions that didn't actually change. Think date_trunc('day', timestamp):
if the timestamp changes within the same day, the indexed value doesn't
change, but the column did, so HOT is forbidden. The logic won't create
false negatives, though (allowing HOT when it isn't safe).
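
To make the false positive concrete, here is a small Python sketch. It is
not PostgreSQL code: date_trunc_day, column_changed and expression_changed
are hypothetical names made up for illustration, and the column-level check
is only my rough model of what HOT does.

```python
from datetime import datetime

def date_trunc_day(ts):
    # Rough stand-in for date_trunc('day', ts)
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

def column_changed(old_ts, new_ts):
    # Column-level check (assumed model): any change to the input column counts
    return old_ts != new_ts

def expression_changed(old_ts, new_ts):
    # Expression-level check: only the indexed value matters
    return date_trunc_day(old_ts) != date_trunc_day(new_ts)

old_ts = datetime(2016, 8, 8, 9, 0)
new_ts = datetime(2016, 8, 8, 17, 24)   # same day, later time

col = column_changed(old_ts, new_ts)       # column differs
expr = expression_changed(old_ts, new_ts)  # indexed value does not
print(col, expr)
```

The column-level check reports a change while the indexed value stays the
same, which is the conservative (safe but pessimistic) direction.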

For WARM, though, false negatives could happen, because the guarantee
that you need is that no key already present in the WARM chain will be
re-added. An index over (a - b) could see both a and b change in
increasing order and still repeat keys, violating the WARM chain invariant.
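
A tiny Python sketch of that (a - b) case, under a simplified model of the
invariant as I stated it above (no key already in the chain gets re-added).
first_repeated_key and the sample values are made up for illustration; this
is not the actual WARM design.

```python
def first_repeated_key(versions, key_fn):
    """Return the first key that gets re-added to the chain, or None."""
    seen = set()
    prev = None
    for v in versions:
        key = key_fn(v)
        if key == prev:
            # Key unchanged from the previous version: no new index entry
            continue
        if key in seen:
            # Key was already in the chain and is being added again
            return key
        seen.add(key)
        prev = key
    return None

# (a, b) for successive tuple versions; a and b are both strictly increasing
versions = [(5, 1), (6, 3), (8, 4)]

repeat_diff = first_repeated_key(versions, lambda v: v[0] - v[1])  # keys 4, 3, 4
repeat_a = first_repeated_key(versions, lambda v: v[0])            # keys 5, 6, 8
print(repeat_diff, repeat_a)
```

An index on a alone never repeats a key here, but the expression index
goes 4, 3, 4, so watching the direction of change of each column isn't
enough.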

>> What I intend to try, is modify btree to allow efficient search of a
>> key-ctid pair, by adding the ctid to the sort order. Only inner pages
>> need to be changed, to include the ctid in the pointers, leaf pages
>> already have the ctid there, so they don't need any kind of change. A
>> bit in the metapage could indicate whether it's a new format or an old
>> one, and yes, only new indices will be able to use WARM.
>>
>> But with efficient key-ctid searches, you handle all cases, and not
>> just a few common ones.
>
> True.  I am worried about page spills caused by having to insert a row
> into an existing page and an index entry having to be pushed to an
> adjacent page to maintain ctid index order.

I don't think it would be a concern, just as inserting a serial column shouldn't be.

Maybe some pages will split that wouldn't have, but the split will add
room to the new leaf pages, and some splits that would have happened
before won't happen afterwards.

Think of it as equivalent to adding the oid to the index - it's some
immutable attribute of the row being inserted. The fact that it is a
tid shouldn't make a difference to the performance of the btree, aside
from the extra comparisons perhaps.
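
As a rough illustration of searching for a key-ctid pair, here is a Python
sketch that keeps entries sorted by (key, ctid) and uses binary search. It
models a single sorted list rather than real btree pages, the ctid is
modeled as a (block, offset) tuple, and insert/contains are hypothetical
helpers, not anything in the btree code.

```python
import bisect

# Index entries kept sorted by (key, ctid); ctid modeled as (block, offset)
entries = []

def insert(key, ctid):
    # The ctid acts as a trailing sort column, like any other attribute
    bisect.insort(entries, (key, ctid))

def contains(key, ctid):
    # Binary search straight to the exact (key, ctid) pair,
    # without scanning every duplicate of the key
    i = bisect.bisect_left(entries, (key, ctid))
    return i < len(entries) and entries[i] == (key, ctid)

# Several duplicates of key 42, inserted in arbitrary ctid order
for ctid in [(7, 2), (1, 5), (3, 1)]:
    insert(42, ctid)
insert(41, (9, 9))

found = contains(42, (3, 1))
missing = contains(42, (3, 2))
print(entries)
print(found, missing)
```

Duplicates of a key end up ordered by ctid, so finding one specific entry
costs a logarithmic search plus the extra comparisons on the ctid.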


