Thread: BUG #12000: "CROSS JOIN" not equivalent to ","
The following bug has been logged on the website: Bug reference: 12000 Logged by: Andreas Kunert Email address: kunert@cms.hu-berlin.de PostgreSQL version: 9.3.5 Operating system: Ubuntu 14.04.1 LTS Description: Still not sure if this is really a bug, but I could not find a good explanation for the following behaviour: According to the documentation http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html (7.2.1.1): "FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2" However, in the following example the SQL-Query using "CROSS JOIN" works, while the usage of "," results in an invalid reference: drop table if exists table1; drop table if exists table2; create table table1 (x integer); create table table2 (y integer, z integer); insert into table1 (x) values (1), (2); insert into table2 (y,z) values (1,1), (1,2); -- this query works: select a.x, b.x, c.y from table1 a cross join table1 b left outer join table2 c on a.x = y and b.x = z; -- this does not: select a.x, b.x, c.y from table1 a , table1 b left outer join table2 c on a.x = y and b.x = z; The error in the latter case is: ERROR: invalid reference to FROM-clause entry for table "a" LINE 29: left outer join table2 c on a.x = y and b.x = z; ^ HINT: There is an entry for table "a", but it cannot be referenced from this part of the query. ********** Error ********** ERROR: invalid reference to FROM-clause entry for table "a" SQL state: 42P01 Hint: There is an entry for table "a", but it cannot be referenced from this part of the query. Character: 447 Another example resulting in the same error but without using table aliases: drop table if exists table0; drop table if exists table1; drop table if exists table2; create table table0 (x integer); create table table1 (x integer); create table table2 (y integer, z integer); insert into table0 (x) values (1), (2); insert into table1 (x) values (1), (2); insert into table2 (y,z) values (1,1), (1,2); -- works: select table0.x, table1.x, table2.y from table0 cross join table1 left outer join table2 on table0.x = y and table1.x = z; -- does not work: select table0.x, table1.x, table2.y from table0 , table1 left outer join table2 on table0.x = y and table1.x = z;
kunert@cms.hu-berlin.de writes: > Still not sure if this is really a bug, but I could not find a good > explanation for the following behaviour: > According to the documentation > http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html > (7.2.1.1): > "FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2" > However, in the following example the SQL-Query using "CROSS JOIN" works, > while the usage of "," results in an invalid reference: It's not a bug. The quoted statement is correct as far as it goes, but what it fails to point out is that CROSS JOIN has a different syntactic priority from ",", which matters when you have more than two tables. Fully parenthesized, your examples would look like > from > ((table1 a cross join table1 b) > left outer join table2 c on a.x = y and b.x = z); > from > table1 a > , (table1 b left outer join table2 c on a.x = y and b.x = z); so that table "a" is within the scope of the ON clause in the first case but not the second. I'm not sure we ought to dive into these syntactic details right at that point in the manual, though; it's not really on point for what we're trying to explain there, and could easily make the text less comprehensible not more so. A simple "fix" would be to remove the claim about "," and just compare CROSS JOIN to INNER JOIN ON TRUE. I'm not really convinced that's an improvement ... regards, tom lane
Dr. Andreas Kunert wrote > According to the documentation > http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html > (7.2.1.1): > "FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2" > > However, in the following example the SQL-Query using "CROSS JOIN" works, > while the usage of "," results in an invalid reference: > > table1 a > , table1 b > left outer join table2 c on a.x = y and b.x = z; It does not say that: A, B JOIN C is equivalent to A CROSS JOIN B JOIN C The "," introduces lower precedence operator to the mix and so the B JOIN C is evaluated first and the corresponding "ON" clause can only refer to those two tables. In effect you are writing: FROM A, (B JOIN C) I personally find the use of "," in the FROM-clause to be poor form and its even worse if you decide to mix both "," and explicit JOIN clauses. If you want to make this be explained better feel free to provide suggestions; but I'd suggest you simply forget the fact that "FROM A, B" is even valid and just use the proper joining keywords instead. Given this does not work as far back at 9.0 I would say that it is not a bug. Nor do I believe it is particularly under-documented; its just complicated because of the mixing of two separate concepts. David J. -- View this message in context: http://postgresql.nabble.com/BUG-12000-CROSS-JOIN-not-equivalent-to-tp5827394p5827398.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
Tom Lane-2 wrote > kunert@.hu-berlin > writes: >> Still not sure if this is really a bug, but I could not find a good >> explanation for the following behaviour: > >> According to the documentation >> http://www.postgresql.org/docs/9.3/static/queries-table-expressions.html >> (7.2.1.1): >> "FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2" > >> However, in the following example the SQL-Query using "CROSS JOIN" works, >> while the usage of "," results in an invalid reference: > > It's not a bug. The quoted statement is correct as far as it goes, > but what it fails to point out is that CROSS JOIN has a different > syntactic priority from ",", which matters when you have more than > two tables. Fully parenthesized, your examples would look like > >> from >> ((table1 a cross join table1 b) >> left outer join table2 c on a.x = y and b.x = z); > >> from >> table1 a >> , (table1 b left outer join table2 c on a.x = y and b.x = z); > > so that table "a" is within the scope of the ON clause in the first case > but not the second. > > I'm not sure we ought to dive into these syntactic details right at > that point in the manual, though; it's not really on point for what > we're trying to explain there, and could easily make the text less > comprehensible not more so. > > A simple "fix" would be to remove the claim about "," and just compare > CROSS JOIN to INNER JOIN ON TRUE. I'm not really convinced that's an > improvement ... How about adding the following to that sentence: "However, in the presence of three or more joined relations it is recommended to only use either explicit joins or commas since mixing them introduces non-obvious join order differences." David J. -- View this message in context: http://postgresql.nabble.com/BUG-12000-CROSS-JOIN-not-equivalent-to-tp5827394p5827399.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
David G Johnston wrote: > I personally find the use of "," in the FROM-clause to be poor form and its > even worse if you decide to mix both "," and explicit JOIN clauses. If you > want to make this be explained better feel free to provide suggestions; but > I'd suggest you simply forget the fact that "FROM A, B" is even valid and > just use the proper joining keywords instead. Note that using explicit join form force the optimizer into some specific join order, depending on the join_collapse_limit parameter, whereas using commas gives it absolute freedom regardless of the parameter. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David G Johnston <david.g.johnston@gmail.com> writes: > Tom Lane-2 wrote >> A simple "fix" would be to remove the claim about "," and just compare >> CROSS JOIN to INNER JOIN ON TRUE. I'm not really convinced that's an >> improvement ... > How about adding the following to that sentence: > "However, in the presence of three or more joined relations it is > recommended to only use either explicit joins or commas since mixing them > introduces non-obvious join order differences." I don't think it's the place of the manual to be prescriptive about style; at least, not here. We could do something like "<CROSS JOIN example> is equivalent to <INNER JOIN ON TRUE example>. <CROSS JOIN example> is also equivalent to <example with comma>, but in cases with more than two tables this equivalence is not exact, because JOIN binds more tightly than comma." Or maybe put the "but" in a footnote. regards, tom lane
On Tuesday, November 18, 2014, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > David G Johnston wrote: > > > I personally find the use of "," in the FROM-clause to be poor form and > its > > even worse if you decide to mix both "," and explicit JOIN clauses. If > you > > want to make this be explained better feel free to provide suggestions; > but > > I'd suggest you simply forget the fact that "FROM A, B" is even valid and > > just use the proper joining keywords instead. > > Note that using explicit join form force the optimizer into some > specific join order, depending on the join_collapse_limit parameter, > whereas using commas gives it absolute freedom regardless of the > parameter. > > Only because the parameter used in that case is "from_collapse_limit"... I'm not convinced there exists an example of micro-optimization where mixing the two is a net gain that cannot be done in any other join-only way and thus avoiding the mental context switch to know what binds where. IIRC with the default setting of the threshold GUCs the planner is equally limited when choosing between two homogeneous use queries. David J.
David Johnston <david.g.johnston@gmail.com> writes: > On Tuesday, November 18, 2014, Alvaro Herrera <alvherre@2ndquadrant.com> > wrote: >> Note that using explicit join form force the optimizer into some >> specific join order, depending on the join_collapse_limit parameter, >> whereas using commas gives it absolute freedom regardless of the >> parameter. > Only because the parameter used in that case is "from_collapse_limit"... IIRC, from_collapse_limit only affects the behavior when deciding whether to collapse a mixture of JOIN and comma syntax; it does not change the outcome if you write a whole bunch of tables in a comma-separated list to start with. In any case, such implementation artifacts shouldn't drive our consideration of what's the clearest way to document this SQL-standard behavior. regards, tom lane
On 18.11.2014 18:55, Tom Lane wrote: > I don't think it's the place of the manual to be prescriptive about style; > at least, not here. > > We could do something like "<CROSS JOIN example> is equivalent to <INNER JOIN ON > TRUE example>. <CROSS JOIN example> is also equivalent to <example with > comma>, but in cases with more than two tables this equivalence is not > exact, because JOIN binds more tightly than comma." > > Or maybe put the "but" in a footnote. Personally, I like the footnote idea best. It is a good compromise between not worsening the readability of the documentation and still mentioning the difference in priority for people who stumble upon this behavior like I did.
On Wed, Nov 19, 2014 at 5:21 AM, Dr. Andreas Kunert <kunert@cms.hu-berlin.d= e > wrote: > On 18.11.2014 18:55, Tom Lane wrote: > > I don't think it's the place of the manual to be prescriptive about > style; > > at least, not here. > > > > We could do something like "<CROSS JOIN example> is equivalent to <INNE= R > JOIN ON > > TRUE example>. <CROSS JOIN example> is also equivalent to <example wit= h > > comma>, but in cases with more than two tables this equivalence is not > > exact, because JOIN binds more tightly than comma." > Instead of embedding this in the CROSS JOIN section why not put the additional information at the top of the subsection since it does apply to all of the join types. After: "A joined table is a table derived from two other (real or derived) tables [...] cross-joins are available." add something like: "When explicit joins are mixed in with the comma-list the joins bind more tightly than the commas." David J.=E2=80=8B
David Johnston <david.g.johnston@gmail.com> writes: > Instead of embedding this in the CROSS JOIN section why not put the > additional information at the top of the subsection since it does apply to > all of the join types. I think the problem is specific to CROSS JOIN, because only for that is there a temptation to make an analogy with comma. We do have the binding-strength question addressed explicitly somewhere else, I believe (probably on the SELECT reference page). I don't really feel a need to duplicate that here. I think the footnote approach might be the best solution. regards, tom lane
I wrote: > We do have the binding-strength question addressed explicitly somewhere > else, I believe (probably on the SELECT reference page). I don't really > feel a need to duplicate that here. I think the footnote approach might > be the best solution. I experimented with that and ended up not liking the <footnote> solution, mainly because this text is near the top of a very long <sect1>, so that the footnote is very very far away from where it's referenced (at least in HTML rendering). I ended up just making it a regular <note> and then rearranging some of the existing text so that we weren't forward-referencing the concept of nested JOIN clauses. I think this is a good answer: although we do have the syntactic explanation on the SELECT reference page, it's not bad to have it here too as long as we can work it into subsidiary material. http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=8372304e3594a1d67737df779f098d9ae9937603 regards, tom lane