Re: Faster inserts with mostly-monotonically increasing values - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: Faster inserts with mostly-monotonically increasing values
Date
Msg-id CABOikdOnM8eoeKpTY64hES1TW9G+ggmorHjV5yHZJynQzDcuLg@mail.gmail.com
Whole thread Raw
In response to Re: Faster inserts with mostly-monotonically increasing values  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
Responses Re: Faster inserts with mostly-monotonically increasing values  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers
Hi Andrew and Claudio,

Thanks for the review!

On Fri, Mar 23, 2018 at 10:19 AM, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
On Fri, Mar 23, 2018 at 10:20 AM, Claudio Freire <klaussfreire@gmail.com> wrote:


This patch looks in pretty good shape. I have been trying hard to
think of some failure mode but haven't been able to come up with one.


Great!
 

> Some comments
>
> +    /*
> +     * It's very common to have an index on an auto-incremented or
> +     * monotonically increasing value. In such cases, every insertion happens
> +     * towards the end of the index. We try to optimise that case by caching
> +     * the right-most block of the index. If our cached block is still the
> +     * rightmost block, has enough free space to accommodate a new entry and
> +     * the insertion key is greater or equal to the first key in this page,
> +     * then we can safely conclude that the new key will be inserted in the
> +     * cached block. So we simply search within the cached page and insert the
> +     * key at the appropriate location. We call it a fastpath.
>
> It should say "the insertion key is strictly greater than the first key"


Good catch. Fixed.
 

>
> Also, "rightmost block" != "rightmost leaf page" ("leaf" being the key
> difference). So it should say "rightmost leaf page".


right.


Fixed.
 
[...]
>
> Setting "offset = InvalidOffsetNumber" in that contorted way is
> unnecessary. You can remove the first assignment and instead
> initialize unconditionally right after the fastpath block (that
> doesn't make use of offset anyway):


Yes, I noticed that and it's confusing, Just set it at the top.


Good idea. Changed that way.
 
>
> Having costs in explain tests can be fragile. Better use "explain
> (costs off)". If you run "make check" continuously in a loop, you
> should get some failures related to that pretty quickly.
>

Agree about costs off, but I'm fairly dubious of the value of using
EXPLAIN at all here.Nothing in the patch should result in any change
in EXPLAIN output.

I agree. I initially added EXPLAIN to ensure that we're doing index-only scans. But you're correct, we don't need them in the tests itself.
 

I would probably just have a few regression lines that should be sure
to exercise the code path and leave it at that.


I changed the regression tests to include a few more scenarios, basically using multi-column indexes in different ways and they querying rows by ordering rows in different ways. I did not take away the vacuum and I believe it will actually help the tests by introducing some fuzziness in the tests i.e. if the vacuum does not do its job, we might execute a different plan and ensure that the output remains unchanged.

Thanks,
Pavan

--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
Attachment

pgsql-hackers by date:

Previous
From: Michael Banck
Date:
Subject: Re: [PATCH] Verify Checksums during Basebackups
Next
From: Ashutosh Bapat
Date:
Subject: Odd procedure resolution