Re: Index range search optimization - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Index range search optimization
Date
Msg-id CAPpHfdub4JwdgKoNDqL8RKUY2iUmoFKNH42LF+JZ1kO99dVhNQ@mail.gmail.com
Whole thread Raw
In response to Re: Index range search optimization  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Index range search optimization
List pgsql-hackers
Hi, Peter!

Thank you for your interest in this patch.

On Tue, Sep 19, 2023 at 1:48 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Sep 14, 2023 at 3:23 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> > The attached patch allows Postgres to skip scan keys required for directional scans (even when other keys are
presentin the scan).  I'll soon post the testing results and a more polished version of this patch. 
>
> This is very interesting to me, partly because it seems related to my
> ongoing work on SAOP execution within nbtree.
>
> My patch gives _bt_readpage and particularly _bt_checkkeys more
> high-level context, which they use to intelligently control the scan.
> That enables us to dynamically decide whether we should now perform
> another descent of the index via another call to _bt_first, or if we
> should prefer to continue on the leaf level for now. Maybe we will
> match many distinct sets of array keys on the same leaf page, in the
> same call to _bt_readpage. We don't want to miss out on such
> opportunities, but we also want to quickly notice when we're on a page
> where matching any more array keys is just hopeless.
>
> There is a need to keep these two things in balance. We need to notice
> the hopeless cases before wasting too many cycles on it. That creates
> a practical need to do an early precheck of the high key (roughly the
> same check that we do already). If the high key indicates that
> continuing on this page is truly hopeless, then we should give up and
> do another primitive index scan -- _bt_first will reposition us onto
> the leaf page that we need to go to next, which is (hopefully) far
> away from the leaf page we started on.

This is a pretty neat optimization indeed!

> Your patch therefore has the potential to help my own patch. But, it
> also has some potential to conflict with it, because my patch makes
> the meaning of SK_BT_REQFWD and SK_BT_REQBKWD more complicated (though
> only in cases where we have SK_SEARCHARRAY scan keys). I'm sure that
> this can be managed sensibly, though.

OK!  Let me know if you feel that I need to change something in this
patch to lower the potential conflict.

> Some feedback on your patch:
>
> * I notice that you're not using the high key for this, even in a
> forward scan -- you're using the last non-pivot tuple on the page. Why
> is that? (I have some idea why, actually, but I'd like to hear your
> thoughts first.)

I'm using the last non-pivot tuple on the page instead of hikey
because it's lower than hikey.  As you stated below, the most
significant column in the hikey is likely different from that of the
last non-pivot tuple.  So, it's more likely to use the optimization
with the last non-pivot tuple.

> * Separately, I don't think that it makes sense to use the same
> requiredDirMatched value (which came from the last non-pivot tuple on
> the page) when the special _bt_checkkeys call for the high key takes
> place. I don't think that this will lead to wrong answers, but it's
> weird, and is likely to defeat the existing optimization in some
> important cases.
>
> Due to the influence of suffix truncation, it's relatively likely that
> the most significant column in the high key will be different to the
> corresponding value from the last few non-pivot tuples on the page --
> the high key tends to be "aligned with natural boundaries in the key
> space", and so "gives us a good preview of the right sibling page". We
> don't want to treat it the same as non-pivot tuples here, because it's
> quite different, in ways that are subtle but still important.

This definitely makes sense. I've removed the usage of
requiredDirMatched from this _bt_checkkeys() call.

> * I would avoid using the terminology "preprocess scan keys" for this.
> That exact terminology is already used to describe what
> _bt_preprocess_keys() does.
>
> That function is actually involved in Konstantin's patch, so that
> could be very confusing. When we "preprocess" a scan key, it outputs a
> processed scan key with markings such as the required markings that
> you're using in the patch -- it's something that acts on/changes the
> scan keys themselves. Whereas your patch is exploiting information
> from already-processed scan keys, by applying it to the key space of a
> page.
>
> I suggest calling it "prechecking the page", or something like that. I
> don't feel very strongly about what you call it, provided it isn't
> confusing or ambiguous.


 This also makes sense. I've rephrased the comment.

------
Regards,
Alexander Korotkov

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Disabling Heap-Only Tuples
Next
From: Dean Rasheed
Date:
Subject: Re: Infinite Interval