Re: Faster inserts with mostly-monotonically increasing values - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Faster inserts with mostly-monotonically increasing values
Date
Msg-id CAH2-Wz=X8_FxPO0wkhcEdmTm5x7Ksqx6ix756Nu6KYCAbR7_VQ@mail.gmail.com
Whole thread Raw
In response to Faster inserts with mostly-monotonically increasing values  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Faster inserts with mostly-monotonically increasing values  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Re: Faster inserts with mostly-monotonically increasing values  (Simon Riggs <simon.riggs@2ndquadrant.com>)
Re: Faster inserts with mostly-monotonically increasing values  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
On Sun, Dec 31, 2017 at 6:44 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> Here is a patch that implements the idea. If the last insert happens to be
> in the rightmost block of an index, then we cache the block and check that
> first for the next insert. For the cached block to be usable for the insert,
> it should still be the rightmost, the to-be-inserted key should go into the
> cached block and there is enough free space in the block. If any of these
> conditions do not meet, we fall back to the regular code path, i.e. descent
> down the index from the top.

I'd be particularly concerned about page deletion/page recycling still
being correct with the patch, especially because it's hard to test
anything there. The P_ISLEAF() part of your fastpath's test hints at
this -- why should it not *always* be a leaf page (surely you should
be sure that the page cannot be recycled anyway)? I also have my
doubts about unique index enforcement remaining correct with the patch
when there are many physical duplicates, to the extent that more than
a single leaf page is needed for a single value.

Maybe you should do something with page LSN here -- cache LSN
alongside block number, and have a non-matching LSN invalidate the
test.

How many clients are involved with your benchmark?

> So as the size of the index increases, the benefits of the patch also tend
> to increase. This is most likely because as the index becomes taller and
> taller, the costs associated with index descent becomes higher.

FWIW, I think that int4 indexes have to be extremely large before they
exceed 4 levels (where the leaf level counts as a level). My
conservative back-of-an-envelope estimate is 9 billion index tuples.

-- 
Peter Geoghegan


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: Logical decoding fast-forward and slot advance
Next
From: Fabien COELHO
Date:
Subject: Re: [PATCH] GET DIAGNOSTICS FUNCTION_NAME