Re: [HACKERS] SUM() and GROUP BY - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] SUM() and GROUP BY
Date
Msg-id 17980.916240321@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] SUM() and GROUP BY  ("Thomas G. Lockhart" <lockhart@alumni.caltech.edu>)
Responses Re: [HACKERS] SUM() and GROUP BY  ("D'Arcy" "J.M." Cain <darcy@druid.net>)
Re: [HACKERS] SUM() and GROUP BY  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
>> druid=> SELECT SUM(tramount) FROM acctrans;
>> sum
>> ---
>> 
>> (1 row)
>> Shouldn't that be 0.00?

> No. It is returning NULL, because NULL means "don't know". It doesn't
> mean "nothing" or "zero". That is certainly the correct behavior if the
> table were populated with all NULLs in that column. And by extension, it
> is the correct result if there are no rows at all, since "don't know"
> for a bunch should give the same result as "don't know" for a few or for
> none.

I disagree ... the sum of zero items has traditionally been defined as
zero by any mathematician you care to ask.  No logical problems are
introduced by doing so, and it avoids an unpleasant special case that
applications would otherwise be forced to deal with.  (Example: if
D'Arcy's tramount column has been declared NOT NULL, then it seems to me
that his code is entitled to expect to get a non-NULL result from SUM().
He should not have to cope with a NULL just because the table is empty.)

Now, if the sum is taken over a set of rows that includes at least *one*
NULL, then I agree that SUM should return NULL: you "don't know" what
the sum ought to be if there are some inputs that you "don't know".
But we do know what the sum of no items is: it's zero.  There are no
unknown inputs and therefore no reason to claim we don't know the sum.

IOW, D'Arcy's right.  This is a bug.

You do have to take it on a case-by-case basis, though.  For example
AVG() of no items should (and does) return NULL, because the average
of no items is not defined (since 0/0 is indeterminate).
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] postgres and year 2000
Next
From: Brook Milligan
Date:
Subject: Re: [HACKERS] RPM maintainer?