Re: Faster inserts with mostly-monotonically increasing values - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Faster inserts with mostly-monotonically increasing values
Date
Msg-id CABOikdMx91+Lou8Z5MQEg=44PT-jybMXaoypXyR=QhJGKbYvDg@mail.gmail.com
In response to Re: Faster inserts with mostly-monotonically increasing values  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Faster inserts with mostly-monotonically increasing values
List pgsql-hackers


On Sun, Dec 31, 2017 at 4:36 PM, Peter Geoghegan <pg@bowt.ie> wrote:
On Sun, Dec 31, 2017 at 6:44 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> Here is a patch that implements the idea. If the last insert happens to be
> in the rightmost block of an index, then we cache the block and check that
> first for the next insert. For the cached block to be usable for the insert,
> it should still be the rightmost, the to-be-inserted key should go into the
> cached block, and there should be enough free space in the block. If any of
> these conditions is not met, we fall back to the regular code path, i.e.
> descend the index from the top.

I'd be particularly concerned about page deletion/page recycling still
being correct with the patch, especially because it's hard to test
anything there. The P_ISLEAF() part of your fastpath's test hints at
this -- why should it not *always* be a leaf page (surely you should
be sure that the page cannot be recycled anyway)?

I thought we could throw in a bunch of checks there to ensure that we are still looking at the correct rightmost page, or else fall back to the regular path. Do you see something missing there? I don't claim that I've got everything correct, but since we're holding an exclusive lock on the page at that point, wouldn't we be able to guard against any concurrent splits/deletions? I will reread the documentation/code, but I must admit it's not particularly easy to understand all the concurrency issues.
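To make the discussion concrete, here is a rough sketch of the kind of check I have in mind once the cached block is locked (simplified, not the actual patch code; cached_blkno, itemsz, natts and itup_scankey are just stand-ins for whatever the real code uses, and lock-ordering details are omitted):

    /*
     * Sketch only -- not the actual patch.  Assumes the insertion scankey
     * (itup_scankey) is already built and the new tuple's size (itemsz)
     * is known; cached_blkno is the remembered rightmost block.
     */
    bool        fastpath = false;

    if (cached_blkno != InvalidBlockNumber)
    {
        buf = ReadBuffer(rel, cached_blkno);
        LockBuffer(buf, BT_WRITE);
        page = BufferGetPage(buf);
        opaque = (BTPageOpaque) PageGetSpecialPointer(page);

        if (P_ISLEAF(opaque) && P_RIGHTMOST(opaque) && !P_IGNORE(opaque) &&
            PageGetMaxOffsetNumber(page) >= P_FIRSTDATAKEY(opaque) &&
            PageGetFreeSpace(page) > itemsz &&
            _bt_compare(rel, natts, itup_scankey, page,
                        PageGetMaxOffsetNumber(page)) > 0)
        {
            /*
             * Still the rightmost leaf, not deleted/half-dead, not empty,
             * the new key sorts after the last item, and there is room:
             * insert here without descending the tree.
             */
            fastpath = true;
        }
        else
        {
            /* Something changed underneath us; give up the lock. */
            UnlockReleaseBuffer(buf);
        }
    }

    if (!fastpath)
    {
        /* Regular descent from the root via _bt_search(), as today. */
    }

The P_IGNORE() and P_FIRSTDATAKEY() checks are meant to bail out if the page has been deleted/half-dead or is empty, which I hope covers the recycling concern, but I may well be missing a case.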

 
I also have my
doubts about unique index enforcement remaining correct with the patch
when there are many physical duplicates, to the extent that more than
a single leaf page is needed for a single value.

Why would that happen? We are checking if the to-be-inserted key is strictly greater than the last key in the rightmost block and only then proceeding with the fast path. And we do this while holding an exclusive lock on the page. Are you suggesting that someone can concurrently still split the page?
 

Maybe you should do something with page LSN here -- cache LSN
alongside block number, and have a non-matching LSN invalidate the
test.

This is a good idea, but we might need to track the block/LSN in shared memory. Otherwise, with multiple backends doing inserts, we might never be able to match LSNs (i.e. they will mostly differ, and we would fall back to the slow path the majority of the time).
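For example (again just a sketch; cached_lsn is the hypothetical remembered LSN, wherever it ends up living):

    /*
     * Hypothetical shape of the LSN check, done while holding the
     * exclusive lock on the cached block.
     */
    if (PageGetLSN(page) != cached_lsn)
    {
        /*
         * The page has been modified since we cached it -- quite possibly
         * just by another backend's insert -- so don't trust the cached
         * block and descend from the root instead.
         */
        cached_blkno = InvalidBlockNumber;
        fastpath = false;
    }

With backend-local caching, every insert by any other backend advances the page LSN, so this test would fail almost every time under concurrency; keeping the block/LSN pair in shared memory avoids that, at the cost of some contention on the shared state.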
 

How many clients are involved with your benchmark?

Just a single client for these benchmarks.
 

> So as the size of the index increases, the benefits of the patch also tend
> to increase. This is most likely because as the index becomes taller and
> taller, the costs associated with the index descent become higher.

FWIW, I think that int4 indexes have to be extremely large before they
exceed 4 levels (where the leaf level counts as a level). My
conservative back-of-an-envelope estimate is 9 billion index tuples.


Hmm, OK. I am not entirely sure whether it's the depth itself, or whether binary searching through 3-4 index pages and pinning/unpinning their buffers accounts for most of the overhead. I will run some more tests and collect evidence.
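For reference, a rough back-of-the-envelope check gives about the same figure: with 8 KB pages an int4 index entry is ~20 bytes including its line pointer, so a full page holds roughly 400 entries; at the default 90% leaf fillfactor and ~70% fill for internal pages that is roughly 366 tuples per leaf and ~285 downlinks per internal page, and 285^3 * 366 works out to 8-9 billion tuples before a fifth level is needed (the exact numbers of course depend on fillfactor and alignment).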

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
