Re: Use of additional index columns in rows filtering

From: Peter Geoghegan
Subject: Re: Use of additional index columns in rows filtering
Msg-id: CAH2-WzmAJsGqkfa0fEhk20NowmduwAim43ajBDRU_BAPaBB85w@mail.gmail.com
In response to: Re: Use of additional index columns in rows filtering (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List: pgsql-hackers
On Sun, Jul 23, 2023 at 5:04 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Sorry, I think I meant 'cost estimates', not the selectivity estimates.
> If you convert the original "simple" clauses into the more complex list,
> presumably we'd cost that differently, right? I may be entirely wrong,
> but my intuition is that costing these tiny clauses will be much more
> difficult than costing the original clauses.

I think that that's definitely true (it is more difficult), but there may be a bigger picture.

> Right, I agree with this reasoning in principle.
>
> But I'm getting a bit lost regarding what's the proposed costing
> strategy.

To be clear, I don't have a clue how to better estimate the cardinality of multiple attributes from a composite index. The big and immediate change to SAOP costing with my patch is that genericcostestimate/btcostestimate can safely assume that visiting each leaf page more than once is a physical impossibility. Because it is. It is no longer necessary to treat SAOPs similarly to a nested loop join during costing, which is how it works today.

Now, whenever you add increasingly complicated clauses to a multi-column SAOP query (like the ones I've shown you), it makes sense for the cost to "saturate" at a certain point. That should be representative of the physical reality, for both CPU costs and I/O costs. Right now the worst case really is relevant to the average case, since the risk of the costs just exploding at runtime is very real.
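As a concrete illustration, here is a hypothetical sketch of the kind of multi-column SAOP query in question (the table, index, and values are invented for illustration; they're not the exact queries from earlier in the thread):

-- Composite index on (a, b), with a SAOP on each column.  Today's costing
-- scales the estimate with the number of array element combinations, as
-- if each one required its own descent of the index (the nested loop
-- join-like treatment mentioned above).  Once visiting a leaf page twice
-- is physically impossible, the true cost saturates: it can never exceed
-- the cost of reading each matching leaf page once.
CREATE TABLE saop_demo (a int, b int);
CREATE INDEX saop_demo_a_b_idx ON saop_demo (a, b);

SELECT *
FROM saop_demo
WHERE a = ANY ('{1, 2, 3}')
  AND b = ANY ('{10, 20, 30, 40}');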
If the only problem in this area were the repeated accesses to the same leaf pages (accesses that happen in very close succession anyway), then all of this would be a nice win, but not much more. It certainly wouldn't be expected to change the way we think about stuff that isn't directly and obviously relevant. But it's not just the index pages. Once you start to consider the interactions with filter/qpquals, it gets much more interesting. Now you're talking about completely avoiding physical I/Os for heap accesses, which has the potential to make a dramatic difference to some types of queries, particularly in the worst case.

> It's hard to follow threads spanning days, with various other
> distractions, etc.

I have to admit that my thinking on this very high level stuff is rather unrefined. As much as anything else, I'm trying to invent (or discover) a shared vocabulary for discussing these issues. I might have gone about it clumsily, at times. I appreciate being able to bounce this stuff off you.

> I don't have a very good idea how sensitive the cost is to selectivity
> changes, which I think is crucial for making judgments.

I'm not trying to find a way for the optimizer to make better judgments about costing with a multi-column index. What I'm suggesting (rather tentatively) is to find a way for it to make fewer judgments, or even no judgments at all. If you can find a way of reducing the number of possible choices without any real downside -- in particular by just not producing index paths that cannot possibly be a win in some cases -- then you reduce the number of bad choices. The challenge is making that kind of approach in the optimizer actually representative of the executor in the real world. The executor has to have robust performance under a variety of runtime conditions for any of this to make sense.

> > It's natural to think of things like this _bt_readpage optimization as
> > something that makes existing types of plan shapes run faster. But you
> > can also think of them as things that make new and fundamentally
> > better plan shapes feasible, by making risky things much less risky.
>
> That'd be an interesting optimization, but I don't think that matters
> for this patch, as it's not messing with index scan keys at all. I mean,
> it does not affect what scan keys get passed to the AM at all, or what
> scan keys are required. And it does not influence what the AM does. So
> all this seems interesting, but rather orthogonal to this patch.

Your patch is approximately the opposite of what I'm talking about, in terms of its structure. The important commonality is that each patch adds "superfluous" quals that can be very useful at runtime, under the right circumstances -- which can be hard to predict. Another similarity is that both patches inspire the same kind of lingering doubts about extreme cases -- cases where (somehow) the cost asymmetry that usually works in our favor doesn't apply.

My current plan is to post v1 of my patch early next week. It would be better to discuss this on the thread that I create for that patch. You're right that "exploiting index ordering" on the index AM side is totally unrelated to your patch. Sorry about that.

> I was rather thinking about maybe relaxing the rules about which index
> paths we create, to include indexes that might be interesting thanks to
> index-only filters (unless already picked thanks to sorting).

That seems like it would make sense. In general I think that we overuse and over-rely on bitmap index scans. We should try to chip away at the artificial advantages that bitmap index scans have, so that we can get the benefit of plain index scans more often; accessing the heap/VM inline opens up a lot of possibilities that bitmap scans will never be able to match. (BTW, I'm hoping that your work on index prefetching will help with that.)
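To illustrate, here is a hypothetical sketch of the kind of case where an index-only filter could make such a path interesting (the table, index, and clauses are invented, in the same spirit as the "Index Filter" line in the regression test diff below):

-- With an index on (a, b), "a > 100" can be used as an index qual, but
-- "b % 7 = 0" cannot.  Since the clause references only indexed columns,
-- it could still be evaluated against the index tuple, as an index-only
-- filter, before any heap access; rows that fail the clause never cost
-- us a heap fetch.
CREATE TABLE filter_demo (a int, b int, c text);
CREATE INDEX filter_demo_a_b_idx ON filter_demo (a, b);

SELECT *
FROM filter_demo
WHERE a > 100 AND b % 7 = 0;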
I see that your patch has this diff change (which is 1 out of only 2 or 3 plan changes needed by the patch):

--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -1838,18 +1838,13 @@ DROP TABLE onek_with_null;
 EXPLAIN (COSTS OFF)
 SELECT * FROM tenk1
   WHERE thousand = 42 AND (tenthous = 1 OR tenthous = 3 OR tenthous = 42);
-                                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
- Bitmap Heap Scan on tenk1
-   Recheck Cond: (((thousand = 42) AND (tenthous = 1)) OR ((thousand = 42) AND (tenthous = 3)) OR ((thousand = 42) AND (tenthous = 42)))
-   ->  BitmapOr
-         ->  Bitmap Index Scan on tenk1_thous_tenthous
-               Index Cond: ((thousand = 42) AND (tenthous = 1))
-         ->  Bitmap Index Scan on tenk1_thous_tenthous
-               Index Cond: ((thousand = 42) AND (tenthous = 3))
-         ->  Bitmap Index Scan on tenk1_thous_tenthous
-               Index Cond: ((thousand = 42) AND (tenthous = 42))
-(9 rows)
+                              QUERY PLAN
+-----------------------------------------------------------------------
+ Index Scan using tenk1_thous_tenthous on tenk1
+   Index Cond: (thousand = 42)
+   Index Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
+   Filter: ((tenthous = 1) OR (tenthous = 3) OR (tenthous = 42))
+(4 rows)

That does seem like an improvement to me. But an even better plan is possible, or would be possible once this SAOP-OR-transformation patch is in place (if it were then combined with my own SAOP patch):

https://www.postgresql.org/message-id/flat/919bfbcb-f812-758d-d687-71f89f0d9a68%40postgrespro.ru#9d877caf48c4e331e507b5c63914228e

That could give us an index scan plan that is perfectly efficient. Like the new plan shown here, it would pin/lock a single leaf page from the tenk1_thous_tenthous index, once. But unlike the plan shown here, it would be able to terminate as soon as the index scan reached an index tuple where thousand=42 and tenthous>42. That makes a significant difference if we have to do heap accesses for those extra tuples.

Plus this hypothetical other plan of mine would be more robust: it would tolerate misestimates. It happens to be the case that there just aren't that many tuples with thousand=42; they take up only a fraction of one leaf page. But...why take a chance on that being true, if we don't have to? The old bitmap index scan plan already has this same advantage, because it managed to pass down specific "tenthous" index quals; it is robust in the very same way. It would be nice to have both advantages together.

(To be clear, I'm certainly not suggesting that the example argues against what you want to do here -- it's just an example that jumped out at me.)

Perhaps this example makes my confusion about the boundaries between each of our patches a bit more understandable. I was confused, and I still am. I look forward to being less confused at some point in the future.

--
Peter Geoghegan