Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From: Tomas Vondra
Subject: Re: Use of additional index columns in rows filtering
Msg-id: 79c87e74-46cf-ecbb-2ad5-c44343f92345@enterprisedb.com
In response to: Re: Use of additional index columns in rows filtering (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers

On 7/21/23 21:17, Peter Geoghegan wrote:
> ...
>> But I was
>> thinking more about the costing part - if you convert the clauses in
>> some way, does that affect the reliability of estimates somehow?
> 
> Obviously, it doesn't affect the selectivity at all. That seems most
> important (you kinda said the same thing yourself).
> 

Sorry, I think I meant 'cost estimates', not selectivity estimates.
If you convert the original "simple" clauses into a more complex list of
clauses, presumably we'd cost that differently, right? I may be entirely
wrong, but my intuition is that costing these many tiny clauses will be
much more difficult than costing the original clauses.
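
To illustrate what I mean by "more complex" (a purely hypothetical
rewrite, not necessarily what you actually have in mind):

  -- original clauses
  WHERE a IN (1, 2) AND b IN (1, 2)

  -- rewritten as a list of per-combination clauses
  WHERE (a = 1 AND b = 1) OR (a = 1 AND b = 2)
     OR (a = 2 AND b = 1) OR (a = 2 AND b = 2)

Costing the rewritten form clause by clause seems harder than costing
the two original clauses, even though the selectivity is the same.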

>> If the
>> conversion from AND to OR makes the list of clauses more complex, that
>> might be an issue ...
> 
> That's definitely a concern. Even still, the biggest problem by far in
> this general area is selectivity estimation. Which, in a way, can be
> made a lot easier by this general approach.
> 
> Let's say we have the tenk1 table, with the same composite index as in
> my example upthread (on "(two,four,twenty)"). Further suppose you have
> a very simple query: "select count(*) from tenk1". On master (and with
> the patch) that's going to give you an index-only scan on the
> composite index (assuming it's the only index), which is quite
> efficient despite being a full index scan -- 11 buffer hits.
> 
> This much more complicated count(*) query is where it gets interesting:
> 
> select
>   count(*),
>   two,
>   four,
>   twenty
> from
>   tenk1_dyn_saop
> where
>   two in (0, 1)
>   and four in (1, 2, 3, 4)
>   and twenty in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
> group by
>   two,
>   four,
>   twenty
> order by
>   two,
>   four,
>   twenty;
> 
> It's inherently very difficult to predict how selective this query
> will be using basic statistics. But maybe it doesn't need to matter so
> much, so often.
> 
> The patch can execute this with an index-only scan + GroupAggregate.
> What ends up happening is that the patch gets 9 buffer hits -- so
> pretty close to 11. Master can use almost the same query plan (it uses
> quals but needs to hashagg + sort). It ends up getting 245 buffer hits
> -- vastly more than what we see for the full index scan case (and
> nothing to do with the sort/an issue with a limit). That's nearly as
> many hits as you'd get with a sequential scan. (BTW, I don't need to
> coax the query planner to get this result on master.)
> 
> With the patch you can vary the predicate in whatever way, so that the
> selectivity shifts up or down. Occasionally you'll get maybe one extra
> buffer access relative to the base full index scan case, but overall
> the patch makes the worst case look very much like a full index scan
> (plus some relatively tiny CPU overhead). This is just common sense,
> in a way; selectivities are always between 0.0 and 1.0. Why shouldn't
> we be able to think about it like that?
> 

Right, I agree with this reasoning in principle.

But I'm getting a bit lost regarding what the proposed costing strategy
actually is. It's hard to follow threads spanning days, with various
other distractions, etc.

In principle, I think:

a) If we estimate the scan to return almost everything (or rather, if we
expect it to visit almost the whole index), it makes perfect sense to
cost it as a full index scan.

b) What should we do if we expect to read only a fraction of the index?
If we're optimistic and cost it according to the estimates, but then end
up reading most of the index, how bad could it be (compared to the
optimal plan choice)? Similarly, if we're pessimistic/defensive and cost
it as a full index scan, how many "good" cases would we reject based on
the artificially high cost estimate?

I don't have a very good idea how sensitive the cost is to selectivity
changes, which I think is crucial for making judgments.
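
To make (b) a bit more concrete, this is the kind of thing I'd want to
measure (hypothetical table and index, just a sketch):

  CREATE TABLE t (a int, b int);
  CREATE INDEX ON t (a, b);
  INSERT INTO t SELECT i % 100, i FROM generate_series(1, 1000000) s(i);
  ANALYZE t;

  -- expected to visit only a tiny fraction of the index
  EXPLAIN SELECT * FROM t WHERE a = 1 AND b < 1000;

  -- expected to visit almost the whole index
  EXPLAIN SELECT * FROM t WHERE a < 99 AND b < 999999;

and then see how the cost estimate and the actual runtime move as the
predicates shift between those two extremes.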

>> I wasn't really thinking about LIMIT, and I don't think it changes the
>> overall behavior very much (sure, it's damn difficult to estimate for
>> skewed data sets, but meh).
>>
>> The case I had in mind is something like this:
>>
>> CREATE TABLE t (a int, b int, c int);
>> CREATE INDEX ON t (a);
>> INSERT INTO t SELECT i, i, i FROM generate_series(1,1000000) s(i);
>>
>> SELECT * FROM t WHERE bad_qual(a) AND b < 1 AND c < 1 ORDER BY a;
>>
>> where bad_qual is expensive and matches almost all rows.
> 
> You must distinguish between quals that can become required scan keys
> (or can terminate the scan), and all other quals. This is really
> important for my pending SAOP patch, but I think it might be important
> here too. I wonder if the best place to address the possibility of
> such a regression is in the index AM itself.
> 
> Let's make your example a bit more concrete: let's assume that
> bad_qual is a very expensive integer comparison, against a column that
> has only one possible value. So now your example becomes:
> 
> CREATE TABLE t (a expensive_int, b int, c int);
> CREATE INDEX ON t (a);
> INSERT INTO t SELECT 42, i, i FROM generate_series(1,1000000) s(i);
> SELECT * FROM t WHERE a in (7, 42) AND b < 1 AND c < 1 ORDER BY a;
> 
> (I'm using a SAOP here because the planner will more or less disregard
> the ORDER BY if I make it "= 42" instead. Maybe that makes it
> simpler.)
> 
> Sure, you're getting a full index scan here, and you get all these
> useless comparisons on "a" -- that's a real risk. But AFAICT there is
> no real need for it. There is another nbtree patch that might help. A
> patch that teaches nbtree's _bt_readpage function to skip useless
> comparisons like this:
> 
> https://postgr.es/m/079c3f8e-3371-abe2-e93c-fc8a0ae3f571@garret.ru
> 
> In order for this kind of optimization to be possible at all, we must
> be able to reason about "a" as a column whose values will always be in
> key space order. That is, nbtree must recognize that "a" is the most
> significant key column, not (say) a low-order column from a composite
> index -- it's a required column in both directions. If _bt_readpage
> can determine that the first tuple on a leaf page has the value "42",
> and the high key has that same value, then we can skip all of the
> comparisons of "a" for that page, right away (in fact we don't require
> any comparisons). Now it doesn't matter that they're super expensive
> comparisons (or it hardly matters).
> 
> It's natural to think of things like this _bt_readpage optimization as
> something that makes existing types of plan shapes run faster. But you
> can also think of them as things that make new and fundamentally
> better plan shapes feasible, by making risky things much less risky.
> 

That'd be an interesting optimization, but I don't think it matters much
for this patch, because the patch is not messing with index scan keys at
all. It does not affect which scan keys get passed to the AM, or which
scan keys are required, and it does not influence what the AM does
internally. So all this seems interesting, but rather orthogonal to this
patch.
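
Just to illustrate why I think that (a made-up example, not taken from
the patch itself): with an index on (a, b) and a query like

  CREATE INDEX ON t (a, b);
  SELECT * FROM t WHERE a = 1 AND (b % 10) = 0;

"a = 1" is the scan key handed to the AM, while "(b % 10) = 0" is not an
indexable clause at all - the patch merely evaluates it on the index
tuple before fetching the heap row, and the AM never sees it.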


>> FWIW the "ORDER BY" is necessary, because otherwise we may not even
>> build the index path (no index keys, no interesting pathkeys). It's just
>> an opportunistic optimization - if already doing index scan, try doing
>> this too. I wonder if we should relax that ...
> 
> I'm kinda doing the same thing with ordering in my own patch. In
> general, even if the query really doesn't care about the index order,
> there may be a lot of value in making the nbtree code understand that
> this is an ordered index scan. That's what enables skipping, in all
> its forms (skipping individual comparisons, skipping whole subsections
> of the index, etc).
> 
> I'm not saying that this is 100% problem free. But it seems like a
> promising high level direction.
> 

I was rather thinking about maybe relaxing the rules about which index
paths we create, to include indexes that might be interesting thanks to
index-only filters (unless already picked thanks to sorting).
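
A made-up example of the kind of case I mean:

  CREATE TABLE t (a int, b int, c int);
  CREATE INDEX ON t (a, b);

  -- "(a % 10) = 0" is not an indexable clause, there's no ORDER BY, and
  -- SELECT * rules out an index-only scan, so AFAICS we don't even build
  -- the index path - even though the clause references only index columns
  -- and could be checked before fetching the heap row
  SELECT * FROM t WHERE (a % 10) = 0;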

>> In a way, focusing on the worst case does that by assuming the worst
>> combination - which is fine, although it may choose the slower (but
>> safer) approach in some cases.
> 
> I don't think that it has to be slower on average (even by a tiny
> bit). It might just end up being slightly faster on average, and way
> faster on occasion.
> 

OK, that'd be nice. I don't have a very good intuition about the
behavior for these queries, though - I'd need to play with it and
benchmark it the way I did for the prefetching patch etc.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


