Re: Adding skip scan (including MDAM style range skip scan) to nbtree - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date
Msg-id CAH2-Wz=bM_PwAP926_KKhP7LTTS=TWE=jKNm1Tkp5PMvxoodqQ@mail.gmail.com
Whole thread Raw
In response to Re: Adding skip scan (including MDAM style range skip scan) to nbtree  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
On Tue, Apr 1, 2025 at 5:56 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> 0002:
>
> // nbtutils.c
>
> > +             * (safe even when "key->sk_attno <= firstchangingattnum")
>
> Typo: should be "key->sk_attno >= firstchangingattnum".

Good catch!

Though I think it should be "" safe even when "key->sk_attno >
firstchangingattnum" "", to highlight that the rule here is
significantly more permissive than even the nearby range skip array
case, which is still safe when (key->sk_attno == firstchangingattnum).

As I'm sure you realize, SAOP = keys and regular = keys are only safe
when "key->sk_attno < firstchangingattnum". So there are a total of 3
distinct rules about how firstchangingattnum affects whether it's safe
to advance pstate.startikey past a scan key (which of the 3 rules we
apply depends solely on the type of scan key).

In summary, simple = keys have the strictest firstchangingattnum rule,
range skip arrays/scalar inequalities have a somewhat less restrictive
rule, and non-range skip arrays have the least restrictive/most
permissive rule. As I think you understand already, it is generally
safe to set pstate.startikey to an offset that's past several earlier
simple skip arrays (against several prefix columns, all omitted from
the query) -- even when firstchangingattnum is the lowest possible
value (which is attnum 1).

> I'd also refactor the final segment to something like the following,
> to remove a redundant compare when the attribute we're checking is
> equal between firsttup and lasttup:

At one point the code did look like that, but I concluded that the
extra code wasn't really worth it. We can only save cycles within
_bt_set_startikey itself this way, which doesn't add up to much.
_bt_set_startikey is only called once per page.

Besides, in general it's fairly likely that a range skip array that
_bt_set_startikey sees won't be against a column that we already know
(from the _bt_keep_natts_fast precheck, which returns
firstchangingattnum) to only have one distinct value among all tuples
on the page.

> 0003: LGTM
>
> 0004: LGTM

Great, thanks!

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Hashed IN only applied to first encountered IN
Next
From: Junwang Zhao
Date:
Subject: Re: general purpose array_sort