Re: Reference to parent query from ANY sublink - Mailing list pgsql-hackers
From | Antonin Houska |
---|---|
Subject | Re: Reference to parent query from ANY sublink |
Date | |
Msg-id | 52A77B9E.1020500@gmail.com Whole thread Raw |
In response to | Re: Reference to parent query from ANY sublink (Kevin Grittner <kgrittn@ymail.com>) |
Responses |
Re: Reference to parent query from ANY sublink
|
List | pgsql-hackers |
On 12/06/2013 03:33 PM, Kevin Grittner wrote: > Antonin Houska <antonin.houska@gmail.com> wrote: > >> SELECT * >> FROM tab1 a >> LEFT JOIN >> tab2 b >> ON a.i = ANY ( >> SELECT k >> FROM tab3 c >> WHERE k = a.i); > > This query works with k in any or all tables, but the semantics > certainly vary depending on where k happens to be. It would help a > lot if you showed SQL statements to create and populate the tables > involved and/or if you qualified all referenced column names with > the table alias to avoid ambiguity. I used the DDLs attached (tables.ddl) for this query too, not only for the queries in quaries.sql. Yes, if I had mentioned it and/or qualified the 'k' column reference, it wouldn't have broken anything. > If I assume that the k reference is supposed to be a column in > tab3, what you have is a query where you always get all rows from > tab1, and for each row from tab1 you either match it to all rows > from tab2 or no rows from tab2 depending on whether the tab1 row > has a match in tab3. I concede this particular query is not useful. But the important thing to consider here is which side of the LEFT JOIN the subquery references. >> SELECT * >> FROM tab1 a >> LEFT JOIN >> ( >> SELECT * >> tab2 b >> SEMI JOIN >> ( SELECT k >> FROM tab3 c >> WHERE k = a.i >> ) AS ANY_subquery >> ON a.i = ANY_subquery.k >> ) AS SJ_subquery >> ON true; > > It is hard to see what you intend here, since this is not valid > syntax. This is what I - after having read the related source code - imagine to happen internally when the ANY predicate of the first query is being processed. In fact it should become something like this (also internal stuff) SELECT * FROM tab1 a LEFT JOIN ( tab2 b SEMI JOIN ( SELECT k FROM tab3 c WHERE k = a.i ) AS ANY_subquery ON a.i = ANY_subquery.k) ON true; that is, SEMI JOIN node inserted into the tree rather than a subquery (SJ_subquery). I posted the construct with SJ_subquery to show how I thought about the problem: I thought it's safe (even though not necessarily beautiful) to wrap the SEMI JOIN into the SJ_subquery and let the existing infrastructure decide whether it's legal to turn it into a join node. I concluded that the subquery's references to the tab1 ensure that SJ_subquery won't be flattened, so the patch does nothing if such a reference exists. > PostgreSQL supports semi-joins; but that is an implementation detail > for the EXISTS or IN syntax. ... and for ANY, see subselect.c:convert_ANY_sublink_to_join() > Could you clarify your intent? To get rid of a subplan in some cases that require it so far: when the subquery references table exactly 1 level higher (i.e. the immediate parent query). (I got the idea while reading the source code, as opposed to query tuning.) // Antonin Houska (Tony) > -- > Kevin Grittner > EDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
pgsql-hackers by date: