Thread: count() counts ROW values that are NULL
The manual says:
count
("any"
) →bigint
Computes the number of input rows in which the input value is not null.
But ROW values or composite types that "are null" are counted, anyway. See:
Feels like a bug, but I cannot imagine how this would have slipped everybody's attention for so long. It should at least be documented. Maybe:
Computes the number of input rows in which the input value is not a plain NULL value. (Composite or ROW values count in any case - even if value IS NULL evaluates to true.)
Regards
Erwin
Erwin Brandstetter <brsaweda@gmail.com> writes: > https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE > The manual says: >> count ( "any" ) → bigint >> Computes the number of input rows in which the input value is not null. > But ROW values or composite types that "are null" are counted, anyway. Well, there's nulls and nulls. The SQL "IS NULL" construct is fairly badly designed IMO, because it considers both a plain NULL and a row-of-all-NULL-fields to be "null". count(), like just about everything in Postgres other than "IS NULL", considers only a plain NULL to be null. This is discussed somewhere in the manual, but I think it's under IS NULL, not under all the other places that'd have to be annotated if we decide to annotate as you're suggesting. (One example is that functions that are marked STRICT use the tighter interpretation.) You could use "COUNT(*) FILTER (WHERE NOT (whatever IS NULL))" if you want to count values meeting the IS NULL definition. (Buttressing my point that IS NULL is not well thought out, the obvious "whatever IS NOT NULL" doesn't work here, because it's not the inverse of "whatever IS NULL".) regards, tom lane
On Fri, 18 Mar 2022 at 22:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> The manual says:
>> count ( "any" ) → bigint
>> Computes the number of input rows in which the input value is not null.
> But ROW values or composite types that "are null" are counted, anyway.
Well, there's nulls and nulls. The SQL "IS NULL" construct is fairly
badly designed IMO, because it considers both a plain NULL and a
row-of-all-NULL-fields to be "null". count(), like just about everything
in Postgres other than "IS NULL", considers only a plain NULL to be null.
This is discussed somewhere in the manual, but I think it's under IS NULL,
not under all the other places that'd have to be annotated if we decide to
annotate as you're suggesting. (One example is that functions that are
marked STRICT use the tighter interpretation.)
You could use "COUNT(*) FILTER (WHERE NOT (whatever IS NULL))" if you want
to count values meeting the IS NULL definition. (Buttressing my point
that IS NULL is not well thought out, the obvious "whatever IS NOT NULL"
doesn't work here, because it's not the inverse of "whatever IS NULL".)
I am aware of the mess, and I feel your pain (and my own).
But count(<expression>) is among the most frequently used functions, and hardly any user reading the manual will be aware of the implications. Maybe just:
... in which the input value is not null (does not evaluate to a scalar NULL).
To give them a fighting chance.
Regards
Erwin
On Fri, Mar 18, 2022 at 3:06 PM Erwin Brandstetter <brsaweda@gmail.com> wrote:
On Fri, 18 Mar 2022 at 22:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:Erwin Brandstetter <brsaweda@gmail.com> writes:
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> The manual says:
>> count ( "any" ) → bigint
>> Computes the number of input rows in which the input value is not null.
> But ROW values or composite types that "are null" are counted, anyway.
Well, there's nulls and nulls. The SQL "IS NULL" construct is fairly
badly designed IMO, because it considers both a plain NULL and a
row-of-all-NULL-fields to be "null". count(), like just about everything
in Postgres other than "IS NULL", considers only a plain NULL to be null.
This is discussed somewhere in the manual, but I think it's under IS NULL,
not under all the other places that'd have to be annotated if we decide to
annotate as you're suggesting. (One example is that functions that are
marked STRICT use the tighter interpretation.)
But count(<expression>) is among the most frequently used functions, and hardly any user reading the manual will be aware of the implications. Maybe just:
I'm with Tom on this. The behavior exhibited is the expected behavior. I haven't looked, but if anything I would make the desired point in "composite IS NULL" that this special (ROW(null) IS NULL -> true) interpretation of NULL is limited to this SQL Standard mandated operator and that when speaking generally about a composite being null throughout the documentation it is done in a scalar sense (I don't know how best to word this but select null::rel yields "" while select row(null)::rel yields "()" on printout (assuming rel has a single column)).
David J.