Re: should we have a fast-path planning for OLTP starjoins? - Mailing list pgsql-hackers
| From | Bruce Momjian |
|---|---|
| Subject | Re: should we have a fast-path planning for OLTP starjoins? |
| Date | |
| Msg-id | aRfq-OHE2uAKVdn5@momjian.us Whole thread Raw |
| In response to | Re: should we have a fast-path planning for OLTP starjoins? (Tomas Vondra <tomas@vondra.me>) |
| List | pgsql-hackers |
On Sat, Nov 15, 2025 at 01:41:04AM +0100, Tomas Vondra wrote: > > And then there is the problem of detecting when this happens. > > > > I think my big question is, when we join A->B->C->D, we do a lot of work > > in the optimizer to figure out what order to use, but when we do A->B, > > A->C, A->D, why are we spending the same amount of optimizer effort? > > > > I'm sorry, I don't quite understand what's the question here. What does > A->B->C->D mean, exactly? It means table A joins B, and B joins C, and C joins D. I can see that as a much harder problem, and one we have code for in the optimizer, than A joining to B, C, and D. > > Could we just order B, C, D in order of which have restrictions, or > > based on size? I assume we can't because we don't know if A->C or > > another join would increase the number of rows, and this patch is saying > > if there is a foreign key relationship, they can't increase the rows so > > just short-circuit and order them simply. Is that accurate? > > > > Crazy question, if we have A->B, A->C, and A->D, why can't we just sort > > B,C,D based in increasing order of adding rows to the result, and just > > use that ordering, without requiring foreign keys? > > > > Yeah, that's mostly the idea one of the comments in the patch suggests. > To do the joins that reduce the cardinality first, then the dimensions, > and finally the joins that increase the cardinality. Yes, I saw that. > However, the examples make it look like we're joining pairs of tables, > but that's not necessarily true. The join may be between F and relation > that is really a join itself. And now you need to know how this join > changes the cardinality, which is more expensive than when only looking > at joins of pairs of tables. Yes, if the join is from F to D, and D joins to something else, especially if it is not a foreign key where we know there is only one match, I think we have to give up and go back to the normal optimizer process. > So I think we'd need to first identify these "independent join groups" > first. But that seems non-trivial, because we don't know which table to > start from (we don't know what's the "F" table). Do we easily know how many relations each relation joins to? Does that help us? > I'm sure there's an algorithm to decompose the join tree like this, but > I'm not sure how expensive / invasive it'd be. The premise of this patch > is that it's a cheap optimization, that doesn't need to do this. Yeah, I can see expense being an issue, which explains, as you said, why many other databases have star join "flags", which ideally we can avoid since very few people are going to use the flag. > It simply "peels off" the dimensions from the join, based on things it > can prove locally, without having to decompose the whole jointree. I see, and I think understand better now. Thanks. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Do not let urgent matters crowd out time for investment in the future.
pgsql-hackers by date: