Thread: RE: [INTERFACES] JDBC next() method
On Sunday, 25 April 1999 2:01, Tom Lane [SMTP:tgl@sss.pgh.pa.us] wrote: > If you can demonstrate that this behavior [NULL result from aggregate > functions] is not compliant with the SQL92 standard, then the Postgres > backend will get fixed. > > There's been discussion of this point before on the hackers list, > with some people feeling that the current behavior is OK and others > not happy with it, but so far no one has made a convincing case about > what the standard expects. It seems that the standard is quite ambiguous as to what it expects and doesn't address the limit case (what happens for an aggregate function result when a table is empty). My SQL Programmer's Reference (ISBN 1-56604-760-9) has nothing to say on this matter and I haven't successfully found a copy of the SQL92 specifications (seems harder than tracking down a printed copy of "Historia Regum Britanniae"). I suspect that the SQL specifications may also overlook the issue as well. *sigh* Upon reflection on possible scenarios, it seems that the aggregates should return a row with a null value (but I'm not wholly certain on this). In SQL the following is valid: select column1, column2, column3from table twhere column3 = (select max(column3) from table where column2 = t.column2) Should the nested select have no max value (being an empty result), then perhaps a null is required to be returned such that the comparison for the outer select may still be made. I had overlooked this possibility. With it coming to light, it appears we should have no problems with the current implementation as this example seems to indirectly point to a null value return being a valid result. Someone may wish to check my logic on this. My apologies for ever doubting that things were not working correctly. :) Best regards, JonB.
At 15:58 +0300 on 25/04/1999, Jon Barnett wrote: > It seems that the standard is quite ambiguous as to what it expects and >doesn't > address the limit case (what happens for an aggregate function result when a > table is empty). My SQL Programmer's Reference (ISBN 1-56604-760-9) has > nothing to say on this matter and I haven't successfully found a copy of the > SQL92 specifications (seems harder than tracking down a printed copy of > "Historia Regum Britanniae"). I suspect that the SQL specifications may >also > overlook the issue as well. *sigh* Ah, well, I have a (late draft) of the SQL92 standard. And here is what it has to say on the matter (note 2-b-i): <<< begin quotation from standard >>> General Rules 1) Case: a) If COUNT(*) is specified, then the result is the cardinality of T. b) Otherwise, let TX be the single-column table that is the result of applying the <value expression> to each rowof T and eliminating null values. If one or more null values are eliminated, then a completion condition is raised:warning- null value eliminated in set function. 2) If DISTINCT is specified, then let TXA be the result of elimi- nating redundant duplicate values from TX. Otherwise,let TXA be TX. Case: a) If the <general set function> COUNT is specified, then the result is the cardinality of TXA. b) If AVG, MAX, MIN, or SUM is specified, then Case: i) If TXA is empty, then the result is the null value. ii) If AVG is specified, then the result is the average of the values in TXA. iii) If MAX or MIN is specified, then the result is respec- tively the maximum or minimum value in TXA. These results are determined using the comparison rules specified in Subclause 8.2, "<comparison predicate>". iv) If SUM is specified, then the result is the sum of the values in TXA. If the sum is not within the range ofthe data type of the result, then an exception condition is raised: data exception-numeric value out of range. <<< end quotation from standard >>> Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Herouth Maoz <herouth@oumail.openu.ac.il> writes: > Ah, well, I have a (late draft) of the SQL92 standard. And here is what it > has to say on the matter (note 2-b-i): > Case: > a) If the <general set function> COUNT is specified, then the > result is the cardinality of TXA. > [ie, 0 for an empty set - tgl] > b) If AVG, MAX, MIN, or SUM is specified, then > Case: > i) If TXA is empty, then the result is the null value. Well, this is certainly adequate precedent for the behavior of these particular aggregates --- although I'd have to say that the standard- writers blew it for SUM; SUM of an empty set ought to return 0 not null. (It looks like Postgres follows the spec, however.) Anyway, we have here precedent for deciding on a function-by-function basis whether an aggregate over an empty selection should return NULL or return an appropriate ordinary value. There's no case in which the result of the aggregate has no rows, however. So I think Postgres is fairly standards-compliant on this point. Now that I think about it, the arguments on the hackers list were not about the plain SELECT case but about the GROUP BY case. For example, if you do SELECT productname, AVG(saleprice) FROM sales GROUP BY productname; then you get a row in the output for each different productname, and a separate instance of AVG is run over the prices for each group. (Unless there are NULLs in the saleprice column, none of the AVG instances could ever return a null result.) BUT: what happens if the sales table is empty? There are no productnames, therefore no groups, therefore no rows ought to appear in the output (IMHO). However, what Postgres actually does right now is to emit one all-nulls row (but only if an aggregate function was used; if you say "SELECT productname FROM sales GROUP BY productname" then you get no rows). That is the behavior that we've gone 'round and 'round on without any resolution; it seems obviously inconsistent to me, but others think it's OK because it parallels what happens in the non- GROUP BY case. Is there anything in the SQL92 spec addressing this point? regards, tom lane
At 19:40 +0300 on 25/04/1999, Tom Lane wrote: > Well, this is certainly adequate precedent for the behavior of these > particular aggregates --- although I'd have to say that the standard- > writers blew it for SUM; SUM of an empty set ought to return 0 not > null. (It looks like Postgres follows the spec, however.) I don't agree. Suppose you want to sum of all your banking transactions in January. There is a distinction between getting a 0, meaning you had a balanced budget in January, and getting a null, meaning you made no transactions in January. > Now that I think about it, the arguments on the hackers list were not > about the plain SELECT case but about the GROUP BY case. For example, > if you do > > SELECT productname, AVG(saleprice) FROM sales GROUP BY productname; > > then you get a row in the output for each different productname, and > a separate instance of AVG is run over the prices for each group. > (Unless there are NULLs in the saleprice column, none of the AVG > instances could ever return a null result.) > > BUT: what happens if the sales table is empty? There are no > productnames, therefore no groups, therefore no rows ought to appear > in the output (IMHO). However, what Postgres actually does right now > is to emit one all-nulls row (but only if an aggregate function was > used; if you say "SELECT productname FROM sales GROUP BY productname" > then you get no rows). That is the behavior that we've gone 'round and > 'round on without any resolution; it seems obviously inconsistent to me, > but others think it's OK because it parallels what happens in the non- > GROUP BY case. > > Is there anything in the SQL92 spec addressing this point? Ooh, definitely... Here are the general rules regarding a query expression (i.e. a general select statement). A grouped table is defined somewhere else in the document as the result of a group by or having clause. You really should read the definitions (I can make my draft available on the web for a while. I don't remember where I downloaded it). <<< Begin quotation >>> General Rules 1) Case: a) If T is not a grouped table, then Case: i) If the <select list> contains a <set function specifica- tion> that contains a reference to a column of T or di-rectly contains a <set function specification> that does not contain an outer reference, then T is the argument or argumentsource of each such <set function specification> and the result of the <query specification> is a table con- sistingof 1 row. The i-th value of the row is the value specified by the i-th <value expression>. ii) If the <select list> does not include a <set function spec- ification> that contains a reference to T, then each<value expression> is applied to each row of T yielding a table of M rows, where M is the cardinality of T. The i-thcolumn of the table contains the values derived by the evaluation of the i-th <value expression>. Case: 1) If the <set quantifier> DISTINCT is not specified, then the table is the result of the <query specification>. 2) If the <set quantifier> DISTINCT is specified, then the result of the <query specification> is the table derived from that table by the elimination of any redundant duplicate rows. b) If T is a grouped table, then Case: i) If T has 0 groups, then the result of the <query specifica- tion> is an empty table. ii) If T has one or more groups, then each <value expression> is applied to each group of T yielding a table of M rows,where M is the number of groups in T. The i-th column of the table contains the values derived by the evaluation ofthe i-th <value expression>. When a <value expression> is applied to a given group of T, that group is the argument orargument source of each <set function specification> in the <value expression>. Case: 1) If the <set quantifier> DISTINCT is not specified, then the table is the result of the <query specification>. 2) If the <set quantifier> DISTINCT is specified, then the result of the <query specification> is the table derived from T by the elimination of any redundant duplicate rows. <<< End quotation >>> Which implies exactly what you say. Perhaps you should forward this to the Hackers list to re-start the argument - I am not a subscriber there. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma