Re: Converting NOT IN to anti-joins during planning - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: Converting NOT IN to anti-joins during planning |
Date | |
Msg-id | 3793.1565689764@linux-edt6 Whole thread Raw |
In response to | Re: Converting NOT IN to anti-joins during planning (David Rowley <david.rowley@2ndquadrant.com>) |
Responses |
Re: Converting NOT IN to anti-joins during planning
|
List | pgsql-hackers |
David Rowley <david.rowley@2ndquadrant.com> wrote: > On Mon, 27 May 2019 at 20:43, Antonin Houska <ah@cybertec.at> wrote: > > I've spent some time looking into this. > > Thank you for having a look at this. > > > One problem I see is that SubLink can be in the JOIN/ON clause and thus it's > > not necessarily at the top of the join tree. Consider this example: > > > > CREATE TABLE a(i int); > > CREATE TABLE b(j int); > > CREATE TABLE c(k int NOT NULL); > > CREATE TABLE d(l int); > > > > SELECT * > > FROM > > a > > JOIN b ON b.j NOT IN > > ( SELECT > > c.k > > FROM > > c) > > JOIN d ON b.j = d.l; > > hmm yeah. Since the proofs that are being used in > expressions_are_not_nullable assume the join has already taken place, > then we'll either need to not use the join conditions are proofs in > that case, or just disable the optimisation instead. I think it's > fine to just disable the optimisation since it seem rather unlikely > that someone would write a join condition like that. Plus it seems > quite a bit more complex to validate that the optimisation would even > be correct if NULLs were not possible. > > I've attached a patch which restricts the pullups to FromExpr quals. > Anything below a JoinExpr disables the optimisation now. ok. The planner pulls-up other sublinks located in the ON clause, but it'd be quite tricky to do the same for the NOT IN case. Now that we only consider the WHERE clause, I wonder if the code can be simplified a bit more. In particular, pull_up_sublinks_jointree_recurse() passes valid pointer for notnull_proofs to pull_up_sublinks_qual_recurse(), while it also passes under_joinexpr=true. The latter should imply that NOT IN won't be converted to ANTI JOIN anyway, so no notnull_proofs should be needed. BTW, I'm not sure if notnull_proofs=j->quals is correct in cases like this: case JOIN_LEFT: j->quals = pull_up_sublinks_qual_recurse(root, j->quals, &j->rarg, rightrelids, NULL, NULL, j->quals, true); Even if j->quals evaluates to NULL or FALSE (due to NULL value on its input), it does not remove any rows (possibly containing NULL values) from the input of the SubLink's expression. I'm not even sure that expressions_are_not_nullable() needs the notnull_proofs argument. Now that we only consider SubLinks in the WHERE clause, it seems to me that nonnullable_vars is always a subset of nonnullable_inner_vars, isn't it? A few more minor findings: @@ -225,10 +227,13 @@ pull_up_sublinks(PlannerInfo *root) * * In addition to returning the possibly-modified jointree node, we return * a relids set of the contained rels into *relids. + * + * under_joinexpr must be passed as true if 'jtnode' is or is under a + * JoinExpr. */ static Node * pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode, - Relids *relids) + Relids *relids, bool under_joinexpr) { if (jtnode == NULL) { The comment "if 'jtnode' is or is under ..." is unclear. * is_NOTANY_compatible_with_antijoin() ** "'outerquery' is the parse of the query" -> "'outerquery' is the parse tree of the query" ** "2. We assume that each join qual is an OpExpr" -> "2. We assume that each sublink expression is an OpExpr" ? ** (OpExpr *) lfirst(lc) -> lfirst_node(OpExpr, lc) ** The kind of bool expression (AND_EXPR) might need to be tested here too: + /* Extract exprs from multiple expressions ANDed together */ + else if (IsA(testexpr, BoolExpr)) * find_innerjoined_rels() "we locate all WHERE and JOIN/ON quals that constrain these rels add them to" -> " ... and add them ..." * get_attnotnull() The comment says that FALSE is returned if the attribute is dropped, however the function it does not test att_tup->attisdropped. (This patch should not call the function for a dropped attribute, so I'm only saying that the function code is not consistent with the comment.) -- Antonin Houska Web: https://www.cybertec-postgresql.com
pgsql-hackers by date: