Re: Making Vars outer-join aware - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Making Vars outer-join aware |
Date | |
Msg-id | 2291988.1674588687@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Making Vars outer-join aware (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Making Vars outer-join aware
|
List | pgsql-hackers |
I wrote: > Hans Buschmann <buschmann@nidsa.net> writes: >> I just noticed your new efforts in this area. >> I wanted to recurr to my old thread [1] considering constant propagation of quals. >> [1] https://www.postgresql.org/message-id/1571413123735.26467@nidsa.net > Yeah, this patch series is not yet quite up to the point of improving > that. That area is indeed the very next thing I want to work on, and > I did spend some effort on it last month, but I ran out of time to get > it working. Maybe we'll have something there for v17. BTW, to clarify what's going on there: what I want to do is allow the regular equivalence-class machinery to handle deductions from equality operators appearing in LEFT JOIN ON clauses (maybe full joins too, but I'd be satisfied if it works for one-sided outer joins). I'd originally hoped that distinguishing pre-nulled from post-nulled variables would be enough to make that safe, but it's not. Here's an example: select ... from t1 left join t2 on (t1.x = t2.y and t1.x = 1); If we turn the generic equivclass.c logic loose on these clauses, it will deduce t2.y = 1, which is good, and then apply t2.y = 1 at the scan of t2, which is even better (since we might be able to turn that into an indexscan qual). However, it will also try to apply t1.x = 1 at the scan of t1, and that's just wrong, because that will eliminate t1 rows that should come through with null extension. My current plan for making this work is to define EquivalenceClass-generated clauses as applying within "join domains", which are sets of inner-joined relations, and in the case of a one-sided outer join then the join itself belongs to the same join domain as its right-hand side --- but not to the join domain of its left-hand side. This would allow us to push EC clauses from an outer join's qual down into the RHS, but not into the LHS, and then anything leftover would still have to be applied at the join. In this example we'd have to apply t1.x = t2.y or t1.x = 1, but not both, at the join. I got as far as inventing join domains, in the 0012 patch of this series, but I haven't quite finished puzzling out the clause application rules that would be needed for this scenario. Ordinarily an EC containing a constant would be fully enforced at the scan level (i.e., apply t1.x = 1 and t2.y = 1 at scan level) and generate no additional clauses at join level; but that clearly doesn't work anymore when some of the scans are outside the join domain. I think that the no-constant case might need to be different too. I have some WIP code but nothing I can show. Also, this doesn't seem to help for full joins. We can treat the two sides as each being their own join domains, but then the join's own ON clause doesn't belong to either one, since we can't throw away rows from either side on the basis of a restriction from ON. So it seems like we'll still need ad-hoc logic comparable to reconsider_full_join_clause, if we want to preserve that optimization. I'm only mildly discontented with that, but still discontented. regards, tom lane
pgsql-hackers by date: