Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Msg-id: CAH2-Wz=3O5QpiVgnzwHjLEc6jvLgmKowFCCYBMXFDRgrjeQmWg@mail.gmail.com
In response to: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List: pgsql-hackers
On Mon, Mar 18, 2024 at 9:25 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> I was thinking about a more unified processing model, where
> _bt_preprocess_keys would iterate over all keys, including processing
> of array keys (by merging and reduction to normal keys) if and when
> found. This would also reduce the effort expended when there are
> contradictory scan keys, as array preprocessing is relatively more
> expensive than other scankeys and contradictions are then found before
> processing of later keys.

Does it really matter? I just can't get excited about maybe getting one
less syscache lookup in queries that involve *obviously* contradictory
keys at the SQL level.

Especially because we're so much better off with the new design here
anyway; calling _bt_preprocess_keys once rather than once per distinct
set of array keys is an enormous improvement, all on its own.

My concern with preprocessing overhead is almost entirely limited to
pathological performance issues involving extreme (even adversarial)
input scan keys/arrays. I feel that if we're going to guarantee that we
won't choke in _bt_checkkeys, even given billions of distinct sets of
array keys, we should make the same guarantee in _bt_preprocess_keys --
just to avoid POLA violations. But that's the only thing that seems
particularly important in the general area of preprocessing performance.

(Preprocessing performance probably does matter quite a bit in more
routine cases, where </<= and >/>= are mixed together on the same
attribute. But there'll be no new _bt_preprocess_keys operator/function
lookups for stuff like that.)

The advantage of not completely merging _bt_preprocess_array_keys with
_bt_preprocess_keys is that preserving this much of the existing code
structure allows us to still decide how many array keys we'll need for
the scan up front (if the scan doesn't end up having an unsatisfiable
qual). _bt_preprocess_array_keys will eliminate redundant arrays early,
in practically all cases (the exception is when it has to deal with an
incomplete opfamily lacking the appropriate cross-type ORDER proc), so
we don't have to think about merging arrays after that point. Rather
like how we don't have to worry about "WHERE a < any('{1, 2, 3}')" type
inequalities after _bt_preprocess_array_keys does an initial round of
array-specific preprocessing (_bt_preprocess_keys can deal with those in
the same way as it will with standard inequalities).
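To illustrate (the table and index here are made up, and the rewrites
actually happen on nbtree scan keys during preprocessing, not on the
query text), the effect is roughly as if the quals had been written
differently at the SQL level:

    CREATE TABLE tab (a int, b int);
    CREATE INDEX tab_a_b_idx ON tab (a, b);

    -- Two = ANY() arrays against the same column reduce to their
    -- intersection:
    SELECT * FROM tab WHERE a = ANY('{1,2,3}') AND a = ANY('{2,3,4}');
    -- behaves as if written:
    SELECT * FROM tab WHERE a = ANY('{2,3}');

    -- An "< ANY()" inequality reduces to an ordinary inequality against
    -- its most permissive element:
    SELECT * FROM tab WHERE a < ANY('{1,2,3}');
    -- behaves as if written:
    SELECT * FROM tab WHERE a < 3;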
> > This preprocessing work should all be happening during planning, not
> > during query execution -- that's the design that makes the most sense.
> > This is something we've discussed in the past in the context of skip
> > scan (see my original email to this thread for the reference).
>
> Yes, but IIRC we also agreed that it's impossible to do this fully in
> planning, amongst others due to joins on array fields.

Even with a nested loop join's inner index scan, where the constants
used for each btrescan are not really predictable in the planner, we can
still do most preprocessing in the planner, at least most of the time.
We could still easily do analysis that is capable of ruling out
redundant or contradictory scan keys for any possible parameter value --
seen or unseen. I'd expect this to be the common case -- most of the
time these inner index scans need only one simple = operator (maybe 2 =
operators).

Obviously, that approach still requires that btrescan at least accept a
new set of constants for each new inner index scan invocation. But
that's still much cheaper than calling _bt_preprocess_keys from scratch
every time btrescan is called.

> > It
> > would be especially useful for the very fancy kinds of preprocessing
> > that are described by the MDAM paper, like using an index scan for a
> > NOT IN() list/array (this can actually make sense with a low
> > cardinality index).
>
> Yes, indexes such as those on enums. Though, in those cases the NOT IN
> () could be transformed into IN()-lists by the planner, but not the
> index.

I think that it would probably be built as just another kind of index
path, like the ones we build for SAOPs. Anti-SAOPs? Just as with SAOPs,
the disjunctive accesses wouldn't really be something that the planner
would need too much direct understanding of (except during costing).

I'd only expect the plan to use such an index scan when it wasn't too
far off needing a full index scan anyway. Just like with skip scan, the
distinction between these NOT IN() index scan paths and a simple full
index scan path is supposed to be fuzzy (maybe they'll actually turn out
to be full scans at runtime, since the number of primitive index scans
is determined dynamically, based on the structure of the index at
runtime).
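As a made-up example of the kind of case where that could pay off (none
of these names come from the patch), a NOT IN() qual over a
low-cardinality enum column is logically just a short IN() list over the
remaining values -- a handful of disjunctive index accesses:

    CREATE TYPE order_status AS ENUM ('new', 'paid', 'shipped', 'cancelled');
    CREATE TABLE orders (id bigint, status order_status);
    CREATE INDEX orders_status_idx ON orders (status);

    SELECT * FROM orders WHERE status NOT IN ('cancelled');
    -- is equivalent to:
    SELECT * FROM orders WHERE status IN ('new', 'paid', 'shipped');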
> > The structure for preprocessing that I'm working towards (especially
> > in v15) sets the groundwork for making those shifts in the planner,
> > because we'll no longer treat each array constant as its own primitive
> > index scan during preprocessing.
>
> I hope that's going to be a fully separate patch. I don't think I can
> handle much more complexity in this one.

Allowing the planner to hook into _bt_preprocess_keys is absolutely not
in scope here. I was just making the point that treating array scan keys
like just another type of scan key during preprocessing is going to help
with that too.

> Yeah. The _bt_readpage comment doesn't actually contain the search
> term scanBehind, so I wasn't expecting that to be documented there.

Can you think of a better way of explaining it?

> > > I find it weird that we call _bt_advance_array_keys for non-required
> > > sktrig. Shouldn't that be as easy as doing a binary search through the
> > > array? Why does this need to hit the difficult path?
> >
> > What difficult path?
>
> "Expensive" would probably have been a better wording: we do a
> comparative lot of processing in the !_bt_check_compare() +
> !continuescan path; much more than the binary searches you'd need for
> non-required array key checks.

I've come around to your point of view on this -- at least to a degree.
I'm now calling _bt_advance_array_keys from within _bt_check_compare for
non-required array keys only. If nothing else, it's clearer this way
because it makes it obvious that non-required arrays cannot end the
(primitive) scan. There's no further need for the wart in
_bt_check_compare's contract about "continuescan=false" being associated
with a non-required scan key in this one special case.

> > Advancing non-required arrays isn't that different to advancing
> > required ones. We will never advance required arrays when called just
> > to advance a non-required one, obviously. But we can do the opposite.
> > In fact, we absolutely have to advance non-required arrays (as best we
> > can) when advancing a required one (or when the call was triggered by
> > a non-array required scan key).
>
> I think it's a lot more expensive to do the non-required array key
> increment for non-required triggers. What are we protecting against
> (or improving) by always doing advance_array_keys on non-required
> trigger keys?

We give up on the first non-required array that fails to find an exact
match, though -- regardless of whether the scan was triggered by a
required or a non-required scan key (it's equally useful in both types
of array advancement, because they're not all that different).

There were and are steps around starting a new primitive index scan at
the end of _bt_advance_array_keys that aren't required when array
advancement was triggered by an unsatisfied non-required array scan key.
But those steps were skipped given a non-required trigger scan key,
anyway.

> I mean that we should just do the non-required array key binary search
> inside _bt_check_compare for non-required array keys, as that would
> skip a lot of the rather expensive other array key infrastructure, and
> only if we're outside the minimum or maximum bounds of the
> non-required scankeys should we trigger advance_array_keys (unless
> scan direction changed).

I've thought about optimizing non-required arrays along those lines. But
it doesn't really seem to be necessary at all. If we were going to do
it, then it ought to be done in a way that works independently of the
trigger condition for array advancement (that is, it'd work for
non-required arrays that have a required scan key trigger, too).
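For reference, here is the kind of distinction I mean by required versus
non-required array keys (made-up table and index, purely illustrative):

    CREATE TABLE tab (a int, b int);
    CREATE INDEX tab_a_b_idx ON tab (a, b);

    -- Non-required array: a is only bounded by an inequality, so rows
    -- satisfying a > 5 can have b values in any order as the scan moves
    -- through the index. Failing to find a match in b's array can only
    -- filter out the current tuple; it can never end the (primitive) scan.
    SELECT * FROM tab WHERE a > 5 AND b = ANY('{1, 3, 5}');

    -- Required array: with equality on a, matching tuples are read in b
    -- order, so once b is past the array's final element the primitive
    -- scan can end.
    SELECT * FROM tab WHERE a = 5 AND b = ANY('{1, 3, 5}');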
--
Peter Geoghegan