Re: PG 7.2b4 bug? - Mailing list pgsql-hackers
From | Don Baccus |
---|---|
Subject | Re: PG 7.2b4 bug? |
Date | |
Msg-id | 3C20FF98.8080005@pacifier.com Whole thread Raw |
In response to | PG 7.2b4 bug? (Don Baccus <dhogaza@pacifier.com>) |
List | pgsql-hackers |
Some more bug-or-not-bug thoughts ... I thought I'd add a quote from Date that furthers my belief that the subselect example I posted does indeed expose a bug: (T1 is the table conceptually created by the various joins, etc) "[if] the select-item takes the form "scalar-expression [[AS] column]" ... For each row in T1 the scalar-expression is evaluated .." (page 151 Date & Darwin) SQL92 didn't support subselects in the select-item-list. SQL3 extends the expression to include one-row selects that return a single scalar value. It does NOT however add any wording that allows the subselect to be yanked and evaluated once rather than evaluated for each row. The standard uses the word "applied" not "evaluated". I interpret this to mean "evaluated" and it appears that Date does, too. On the other hand the view example is giving the proper result in PG 7.2, though only by luck, as Tom pointed out earler. For (given the view "create view foo as select nextval('foo_sequence') as nextval;") select foo.nextval from multiple_rows; isn't actually legal SQL. It must be stated as: select foo.nextval from foo, multiple_rows; (all PG does is add "foo" to the from clause for me if I leave it out). The semantics of this are obvious when you think about it - materialize "foo" then cross-join the resulting table with multiple_rows. Since "foo" returns a single row computed by "nextval('foo_sequence')" obviously the result seen with PG 7.2 is correct. Date is quite clear on the semantics of this and it makes tons of sense since views are meant to be treated like tables. So: 1. If an explicit scalar subselect appears in the target list, it should be executed for every row in the result set. 2. A view referenced in the target list is actually supposed to be materialized in the FROM clause (even if implictly addedto it for you) then joined to the other tables in the query, if any. Meaning it should always be executed onceand only once. The standard doesn't have PG-style rules, of course, but such tables are also should be in the FROMclause, evaluated and joined afterwards IMO. At least that's my reading and I've spent quite a bit of time on this now. Unfortunately PG currently doesn't use the form of the query to decide whether or not to execute the subselect or view once or for each row, but rather does so depending on the estimated cost of each approach. That's the real bug it seems. The form of the query, not the whim of the optimizer, is the determinant. Neither of these cases is likely to arise frequently in practice, so if I ruled Middle Earth I'd decree that: 1. It be filed as a bug 2. It not be assigned a high priority. However it's not merely of academic interest. The semantics of the view example is such that you should be able to force single-evaluation of a function by simply wrapping it in a view, regardless of whether or not it has side-effects. Meanwhile I get to go off and inspect the roughly 750 queries that use this particular style view and determine which ones incorrectly assume that the view's evaluated more than once per query! :) -- Don Baccus Portland, OR http://donb.photo.net, http://birdnotes.net, http://openacs.org
pgsql-hackers by date: