Pushing ScalarArrayOpExpr support into the btree index AM - Mailing list pgsql-hackers

Almost immediately after we committed index-only scans, there were
complaints that it didn't work with "indexkey IN (...)" conditions, that
is ScalarArrayOpExpr index quals.  That's because the current code only
supports ScalarArrayOpExpr as a bitmap indexscan qual, not a regular
indexscan.  The executor performs a separate indexscan for each element of
the array, relying on the bitmap mechanism to eliminate duplicate hits on
the same heap tuple.

In principle it's not hard to see how ScalarArrayOpExpr could be supported
as a regular indexqual for btree indexes.  If the comparison operator has
<, <=, >=, or > semantics, then the array condition is equivalent to a
simple scalar condition using the greatest or least array element
respectively.  Otherwise (i.e., the operator is =):

1. Sort the array elements into the same order as the btree, eliminating
duplicates and NULL entries.  (If there are no non-null elements, the
qual condition is unsatisfiable and we can skip the indexscan.)

2. Perform an indexscan for each remaining array element, in sequence.
Since we eliminated equal values in step 1, there can be no two matches
to the same index entry, so there's no need for duplicate elimination.
What's more, because we sorted the array to match the index order, index
entries will be matched in index order, so the results of the scan can
still be expected to come out in sorted order, a critical expectation for
btree indexscans.

If we have more than one ScalarArrayOpExpr then we have to be a bit
careful about how we perform the multiple indexscans to ensure that output
ordering is preserved, but it's certainly doable.

So, at least as far as btrees are concerned, it seems like I implemented
the ScalarArrayOpExpr logic at the wrong level and it ought to be pushed
down into the index AM.  The above rules seem pretty btree-specific, so
I don't think that we ought to have the main executor doing any of this.
I envision doing this by marking btree as supporting ScalarArrayOpExpr
scankeys directly, so that the executor does nothing special with them,
and the planner treats them the same as regular scalar indexquals.

So that leaves me wondering whether the existing executor-level
implementation of ScalarArrayOpExpr indexquals ought to be ripped out
entirely.  It would not save all that much code in the executor proper
(basically ExecIndexEvalArrayKeys, ExecIndexAdvanceArrayKeys, and a few
lines of supporting logic).  However, there's a fair amount of cruft
in the planner to deal with the concept that ScalarArrayOpExpr is
supported as a bitmap indexscan qual but not a plain indexscan qual.
If that code is only going to get exercised for non-btree index types,
it's likely to be under-tested and hence a continuing source of bugs.

In principle somebody could be doing something likeWHERE pointcol <@ ANY (ARRAY[list of box values])
and expecting that to generate a bitmap indexscan on a GIST index, but
is it likely that anyone is doing that?  (As opposed to the equivalent
formulation with "pointcol <@ box1 OR pointcol <@ box2 ...", which would
still work to generate OR'd bitmap scans even if we took out the
ScalarArrayOpExpr logic.)

Thoughts?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Call stacks and RAISE INFO
Next
From: Josh Berkus
Date:
Subject: Re: Call stacks and RAISE INFO