Thread: LATERAL, UNNEST and spec compliance
Folks, Andrew Gierth asked me to send this out as his email is in a parlous state at the moment. My comments will follow in replies. Without further ado: SQL2008 says, for 7.6 <table reference> 6) a) If TR is contained in a <from clause> FC with no intervening <query expression>, then the scope clause SC of TR isthe <select statement: single row> or innermost <query specification> that contains FC. The scope of a range variableof TR is the <select list>, <where clause>, <group by clause>, <having clause>, and <window clause> of SC, together with every <lateral derived table> that is simply contained in FC and is preceded by TR, and every <collectionderived table> that is simply contained in FC and is preceded by TR, and the <join condition> of all <joinedtable>s contained in SC that contain TR. If SC is the <query specification> that is the <query expression body>of a simple table query STQ, then the scope of a range variable of TR also includes the <order by clause> of STQ. This is the clause that defines the scope effect of LATERAL, and as can be seen, it defines <collection derived table>, i.e. UNNEST(), as having the same behaviour as <lateral derived table>. It is also worth noting at this point that pg's "FROM func()" syntax is not in the spec (the nearest is "FROM TABLE(<collection value expression>)"). Our implementation of UNNEST currently deviates from the spec by not being implicitly LATERAL; given the (sub)query SELECT * FROM sometable, UNNEST(somearray); then "somearray" is required to be a parameter or outer reference rather than a column of "sometable". To get the spec's behaviour for this, we currently have to do: SELECT * FROM sometable, LATERAL UNNEST(somearray); which is non-standard syntax. (In the spec, only <table subquery> can follow LATERAL.) (We also don't accept the (optional) syntax of S301, allowing multiple parameters to UNNEST().) As I see it, the current options are: 1. Do nothing, and insist on non-standard use of the LATERAL keyword. 2. Add UNNEST to the grammar (or parse analysis) as a special case, making it implicitly LATERAL. (This would make implementing S301 easier, but special cases are ugly.) 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL. (As far as I can tell, those cases whose behaviour would be changed by this actually produce errors in versions priorto 9.3, so no working code should be affected.) Since LATERAL is new in 9.3, I think the pros and cons of these choices should be considered now, rather than being allowed to slide by unexamined. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Thu, Jan 24, 2013 at 09:51:46AM -0800, David Fetter wrote: > Folks, > > Andrew Gierth asked me to send this out as his email is in a parlous > state at the moment. My comments will follow in replies. Without > further ado: > [snip] > > As I see it, the current options are: > > 1. Do nothing, and insist on non-standard use of the LATERAL keyword. > > 2. Add UNNEST to the grammar (or parse analysis) as a special case, making > it implicitly LATERAL. > > (This would make implementing S301 easier, but special cases are ugly.) > > 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL. > > (As far as I can tell, those cases whose behaviour would be changed by > this actually produce errors in versions prior to 9.3, so no working > code should be affected.) > > Since LATERAL is new in 9.3, I think the pros and cons of these choices > should be considered now, rather than being allowed to slide by unexamined. Please find attached a patch which implements approach 3. The vast majority of it is changes to the regression tests. The removed regression tests in join.{sql,out} are no longer errors, although some of them are pretty standard DoS attacks, hence they're all removed. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
On Thu, Jan 24, 2013 at 09:12:41PM -0800, David Fetter wrote: > On Thu, Jan 24, 2013 at 09:51:46AM -0800, David Fetter wrote: > > Folks, > > > > Andrew Gierth asked me to send this out as his email is in a parlous > > state at the moment. My comments will follow in replies. Without > > further ado: > > [snip] > > > > As I see it, the current options are: > > > > 1. Do nothing, and insist on non-standard use of the LATERAL keyword. > > > > 2. Add UNNEST to the grammar (or parse analysis) as a special case, making > > it implicitly LATERAL. > > > > (This would make implementing S301 easier, but special cases are ugly.) > > > > 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL. > > > > (As far as I can tell, those cases whose behaviour would be changed by > > this actually produce errors in versions prior to 9.3, so no working > > code should be affected.) > > > > Since LATERAL is new in 9.3, I think the pros and cons of these choices > > should be considered now, rather than being allowed to slide by unexamined. > > Please find attached a patch which implements approach 3. The vast > majority of it is changes to the regression tests. The removed > regression tests in join.{sql,out} are no longer errors, although some > of them are pretty standard DoS attacks, hence they're all removed. > > Cheers, > David. Oops. Misspelled rtekind in the previous patch. Here's a corrected one, much shorter. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
* David Fetter (david@fetter.org) wrote: > As I see it, the current options are: > > 1. Do nothing, and insist on non-standard use of the LATERAL keyword. I'm not a big fan of this. Providing a good error message saying "you need to use LATERAL for this query to work" makes it slightly better, but I don't feel like there's really any ambiguity here. > 2. Add UNNEST to the grammar (or parse analysis) as a special case, making > it implicitly LATERAL. > > (This would make implementing S301 easier, but special cases are ugly.) This I really don't like. > 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL. > > (As far as I can tell, those cases whose behaviour would be changed by > this actually produce errors in versions prior to 9.3, so no working > code should be affected.) +1 for me on this idea. If you're calling an SRF, passing in a lateral value, 'LATERAL' seems like it's just a noise word, and apparently the SQL authors felt the same, as they don't require it for unnest(). > Since LATERAL is new in 9.3, I think the pros and cons of these choices > should be considered now, rather than being allowed to slide by unexamined. I agree that we should really hammer this down before 9.3 is out the door. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * David Fetter (david@fetter.org) wrote: >> 3. Make all cases of SRFs in the FROM-clause implicitly LATERAL. >> >> (As far as I can tell, those cases whose behaviour would be changed by >> this actually produce errors in versions prior to 9.3, so no working >> code should be affected.) > +1 for me on this idea. If you're calling an SRF, passing in a lateral > value, 'LATERAL' seems like it's just a noise word, and apparently the > SQL authors felt the same, as they don't require it for unnest(). At first I didn't like this idea, but it's growing on me. However ... David is wrong to claim that it's zero-risk. It's true that an SRF can't contain any side-references today, but it can contain an outer reference. Consider a case like SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...) In existing releases the "y" could be a valid outer reference to a.y. If b also has a column y, David's proposal would cause us to prefer that interpretation, since b.y would be more closely nested than a.y. If you're lucky, you'd get a type-mismatch error, but if the two y's are of similar datatypes the query would just silently do something different than it used to. This is a little bit far-fetched, but it could happen. As against that, we make incompatible changes in every release, and it does seem like assuming LATERAL for functions in FROM would be a usability gain most of the time. And special-casing UNNEST to satisfy the standard seems *really* ugly. > I agree that we should really hammer this down before 9.3 is out the > door. Yeah, if we're going to do this it'd make the most sense to do it in the same release that introduces LATERAL. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > However ... David is wrong to claim that it's zero-risk. It's true that > an SRF can't contain any side-references today, but it can contain an > outer reference. Consider a case like > > SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...) I see what you mean, but on the other hand, that looks like something we might actually want to complain about as 'y' is pretty clearly ambiguous here. I'm a bit surprised that doesn't already throw an error. > This is a little bit far-fetched, but it could happen. As against that, > we make incompatible changes in every release, and it does seem like > assuming LATERAL for functions in FROM would be a usability gain most > of the time. And special-casing UNNEST to satisfy the standard seems > *really* ugly. It's definitely far-fetched, imv. If it's possible, within reason, to explicitly throw a "please disambiguate 'y'" type of error in those specific cases, that'd be nice, but I don't think it'd be required. A mention in the release notes would be sufficient. Thanks, Stephen
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...) Actually, this appears to fail already, at least in 9.2.2: => select * from (values (1)) v(a) where v.a in (select x from (values (2)) v2(a), -> generate_series(1,a) x); ERROR: function expression in FROM cannot refer to other relations of same query level LINE 2: generate_series(1,a) x); ^ Unless it's something else that you were referring to...? Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> SELECT ... FROM a WHERE a.x IN (SELECT ... FROM b, srf(y) WHERE ...) > Actually, this appears to fail already, at least in 9.2.2: > => select * from (values (1)) v(a) where v.a in (select x from (values (2)) v2(a), > -> generate_series(1,a) x); > ERROR: function expression in FROM cannot refer to other relations of same query level > LINE 2: generate_series(1,a) x); > ^ Huh ... you're right, I'd forgotten about that. That's an ancient bug that got fixed in passing in the LATERAL work. So, as long as we're not going to fix that bug in the back branches (which would be difficult anyway IIRC), we don't have a compatibility problem ... regards, tom lane
David Fetter <david@fetter.org> writes: > Please find attached a patch which implements approach 3. The vast > majority of it is changes to the regression tests. The removed > regression tests in join.{sql,out} are no longer errors, although some > of them are pretty standard DoS attacks, hence they're all removed. Here's a less quick-hack-y approach to that. regards, tom lane