Re: join removal - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: join removal |
Date | |
Msg-id | 22266.1269641438@sss.pgh.pa.us Whole thread Raw |
In response to | Re: join removal (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: join removal
|
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jul 19, 2009 at 10:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Yeah. �Ideally this sort of thing would happen in prepjointree.c, but >> we don't have nearly enough information at that stage. > You've mentioned this point a couple of times - what is ideal about > prepjointree? Well, it's the place where we like to rearrange the join tree, and dropping a join altogether certainly counts as that. We can't do it there, at the moment anyway, for lack of supporting data --- but if it were possible to do it at that time I think it'd be the cleanest approach. > Reading through the "optimizer functions" section of > src/backend/optimizer/README, it seems like the earliest point at > which we could do this would be just before the call to > make_one_rel(). I think that would eliminate some redundant > computation. Maybe. It would also add a new pass over the join tree that, in 99% of cases, would make no useful contribution whatever. It's possible that this would still be cheaper than a lot of failed calls to join_is_removable, but I'm unconvinced --- we were able to make the failure path through that pretty durn cheap for most simple cases. The approach you're sketching still involves a combinatorial search so I doubt it's going to be cheap. I should maybe pause here a moment to say that my approach to considering the cost of new planner optimizations is to focus on how much time the added code will eat on queries where it fails to make any useful contribution. If the optimization wins, then presumably you will make back at execution time whatever it might have cost you to plan (if this is debatable, you probably shouldn't be bothering with the idea at all). So the pain will come from adding planning time on queries where there isn't any runtime payoff; especially for something like join removal, which only applies to a small minority of queries anyway. Therefore I'm suspicious of adding new passes over the query structure if they are only going to be used for low-probability wins. > The problem with moving it back any further seems to be that we have > to know which clauses are mergejoinable before we can do the necessary > computations; I think flattening of the query tree has to already be > done too. Yeah. I had been thinking that we could build the RelOptInfo and IndexOptInfo structs earlier, but really all of the clause-classification work done by deconstruct_jointree is important as well for this function's purposes, so it's not that easy to push back to prepjointree :-(. I suspect that any such attempt would end up requiring a massive rethinking of the order of operations in the planner. Maybe we should do that sometime but I'm not eager for it. regards, tom lane
pgsql-hackers by date: