Thread: RE: [INTERFACES] JDBC next() method

RE: [INTERFACES] JDBC next() method

From

Jon Barnett

Date:

25 April 1999, 09:02:48

On Sunday, 25 April 1999 2:01, Tom Lane [SMTP:tgl@sss.pgh.pa.us] wrote:
> If you can demonstrate that this behavior [NULL result from aggregate
> functions] is not compliant with the SQL92 standard, then the Postgres
> backend will get fixed.
>
> There's been discussion of this point before on the hackers list,
> with some people feeling that the current behavior is OK and others
> not happy with it, but so far no one has made a convincing case about
> what the standard expects.

It seems that the standard is quite ambiguous as to what it expects and doesn't 
address the limit case (what happens for an aggregate function result when a 
table is empty).  My SQL Programmer's Reference (ISBN 1-56604-760-9) has 
nothing to say on this matter and I haven't successfully found a copy of the 
SQL92 specifications (seems harder than tracking down a printed copy of 
"Historia Regum Britanniae").  I suspect that the SQL specifications may also 
overlook the issue as well.  *sigh*

Upon reflection on possible scenarios, it seems that the aggregates should 
return a row with a null value (but I'm not wholly certain on this).  In SQL 
the following is valid:
select column1, column2, column3from table twhere column3 =    (select max(column3)     from table     where column2 =
t.column2)

Should the nested select have no max value (being an empty result), then 
perhaps a null is required to be returned such that the comparison for the 
outer select may still be made.  I had overlooked this possibility.  With it 
coming to light, it appears we should have no problems with the current 
implementation as this example seems to indirectly point to a null value return 
being a valid result.  Someone may wish to check my logic on this.

My apologies for ever doubting that things were not working correctly. :)

Best regards,

JonB.

RE: [INTERFACES] JDBC next() method

From

Herouth Maoz

Date:

25 April 1999, 10:25:23

At 15:58 +0300 on 25/04/1999, Jon Barnett wrote:

> It seems that the standard is quite ambiguous as to what it expects and
>doesn't
> address the limit case (what happens for an aggregate function result when a
> table is empty).  My SQL Programmer's Reference (ISBN 1-56604-760-9) has
> nothing to say on this matter and I haven't successfully found a copy of the
> SQL92 specifications (seems harder than tracking down a printed copy of
> "Historia Regum Britanniae").  I suspect that the SQL specifications may
>also
> overlook the issue as well.  *sigh*

Ah, well, I have a (late draft) of the SQL92 standard. And here is what it
has to say on the matter (note 2-b-i):

<<< begin quotation from standard >>>
General Rules
1) Case:
   a) If COUNT(*) is specified, then the result is the cardinality     of T.
   b) Otherwise, let TX be the single-column table that is the     result of applying the <value expression> to each
rowof T     and eliminating null values. If one or more null values are     eliminated, then a completion condition is
raised:warning-     null value eliminated in set function.

2) If DISTINCT is specified, then let TXA be the result of elimi-   nating redundant duplicate values from TX.
Otherwise,let TXA be   TX.

   Case:
   a) If the <general set function> COUNT is specified, then the     result is the cardinality of TXA.
   b) If AVG, MAX, MIN, or SUM is specified, then
     Case:
     i) If TXA is empty, then the result is the null value.
    ii) If AVG is specified, then the result is the average of the        values in TXA.
   iii) If MAX or MIN is specified, then the result is respec-        tively the maximum or minimum value in TXA. These
results       are determined using the comparison rules specified in        Subclause 8.2, "<comparison predicate>".

    iv) If SUM is specified, then the result is the sum of the        values in TXA. If the sum is not within the range
ofthe        data type of the result, then an exception condition is        raised: data exception-numeric value out of
range.

<<< end quotation from standard >>>

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma

Re: [INTERFACES] JDBC next() method

From

Tom Lane

Date:

25 April 1999, 12:42:40

Herouth Maoz <herouth@oumail.openu.ac.il> writes:
> Ah, well, I have a (late draft) of the SQL92 standard. And here is what it
> has to say on the matter (note 2-b-i):
>     Case:
>     a) If the <general set function> COUNT is specified, then the
>       result is the cardinality of TXA.
>          [ie, 0 for an empty set - tgl]
>     b) If AVG, MAX, MIN, or SUM is specified, then
>       Case:
>       i) If TXA is empty, then the result is the null value.

Well, this is certainly adequate precedent for the behavior of these
particular aggregates --- although I'd have to say that the standard-
writers blew it for SUM; SUM of an empty set ought to return 0 not
null.  (It looks like Postgres follows the spec, however.)

Anyway, we have here precedent for deciding on a function-by-function
basis whether an aggregate over an empty selection should return NULL
or return an appropriate ordinary value.  There's no case in which the
result of the aggregate has no rows, however.  So I think Postgres
is fairly standards-compliant on this point.

Now that I think about it, the arguments on the hackers list were not
about the plain SELECT case but about the GROUP BY case.  For example,
if you do
SELECT productname, AVG(saleprice) FROM sales GROUP BY productname;

then you get a row in the output for each different productname, and
a separate instance of AVG is run over the prices for each group.
(Unless there are NULLs in the saleprice column, none of the AVG
instances could ever return a null result.)

BUT: what happens if the sales table is empty?  There are no
productnames, therefore no groups, therefore no rows ought to appear
in the output (IMHO).  However, what Postgres actually does right now
is to emit one all-nulls row (but only if an aggregate function was
used; if you say "SELECT productname FROM sales GROUP BY productname"
then you get no rows).  That is the behavior that we've gone 'round and
'round on without any resolution; it seems obviously inconsistent to me,
but others think it's OK because it parallels what happens in the non-
GROUP BY case.

Is there anything in the SQL92 spec addressing this point?
        regards, tom lane

Re: [INTERFACES] JDBC next() method

From

Herouth Maoz

Date:

25 April 1999, 13:27:40

At 19:40 +0300 on 25/04/1999, Tom Lane wrote:

> Well, this is certainly adequate precedent for the behavior of these
> particular aggregates --- although I'd have to say that the standard-
> writers blew it for SUM; SUM of an empty set ought to return 0 not
> null.  (It looks like Postgres follows the spec, however.)

I don't agree. Suppose you want to sum of all your banking transactions in
January. There is a distinction between getting a 0, meaning you had a
balanced budget in January, and getting a null, meaning you made no
transactions in January.

> Now that I think about it, the arguments on the hackers list were not
> about the plain SELECT case but about the GROUP BY case.  For example,
> if you do
>
>     SELECT productname, AVG(saleprice) FROM sales GROUP BY productname;
>
> then you get a row in the output for each different productname, and
> a separate instance of AVG is run over the prices for each group.
> (Unless there are NULLs in the saleprice column, none of the AVG
> instances could ever return a null result.)
>
> BUT: what happens if the sales table is empty?  There are no
> productnames, therefore no groups, therefore no rows ought to appear
> in the output (IMHO).  However, what Postgres actually does right now
> is to emit one all-nulls row (but only if an aggregate function was
> used; if you say "SELECT productname FROM sales GROUP BY productname"
> then you get no rows).  That is the behavior that we've gone 'round and
> 'round on without any resolution; it seems obviously inconsistent to me,
> but others think it's OK because it parallels what happens in the non-
> GROUP BY case.
>
> Is there anything in the SQL92 spec addressing this point?

Ooh, definitely... Here are the general rules regarding a query expression
(i.e. a general select statement). A grouped table is defined somewhere
else in the document as the result of a group by or having clause. You
really should read the definitions (I can make my draft available on the
web for a while. I don't remember where I downloaded it).

<<< Begin quotation >>>
General Rules
1) Case:
   a) If T is not a grouped table, then
     Case:
     i) If the <select list> contains a <set function specifica- tion> that contains a reference to a column of T or
di-rectly contains a <set function specification> that does not contain an outer reference, then T is the argument or
argumentsource of each such <set function specification> and the result of the <query specification> is a table con-
sistingof 1 row. The i-th value of the row is the value specified by the i-th <value expression>.

    ii) If the <select list> does not include a <set function spec- ification> that contains a reference to T, then
each<value expression> is applied to each row of T yielding a table of M rows, where M is the cardinality of T. The
i-thcolumn of the table contains the values derived by the evaluation of the i-th <value expression>.

 Case:
 1) If the <set quantifier> DISTINCT is not specified, then   the table is the result of the <query specification>.
 2) If the <set quantifier> DISTINCT is specified, then the   result of the <query specification> is the table derived
from that table by the elimination of any redundant   duplicate rows.

   b) If T is a grouped table, then
     Case:
     i) If T has 0 groups, then the result of the <query specifica- tion> is an empty table.
    ii) If T has one or more groups, then each <value expression> is applied to each group of T yielding a table of M
rows,where M is the number of groups in T. The i-th column of the table contains the values derived by the evaluation
ofthe i-th <value expression>. When a <value expression> is applied to a given group of T, that group is the argument
orargument source of each <set function specification> in the <value expression>.

 Case:
 1) If the <set quantifier> DISTINCT is not specified, then   the table is the result of the <query specification>.
 2) If the <set quantifier> DISTINCT is specified, then the   result of the <query specification> is the table derived
from T by the elimination of any redundant duplicate   rows.

<<< End quotation >>>

Which implies exactly what you say. Perhaps you should forward this to the
Hackers list to re-start the argument - I am not a subscriber there.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma