Thread: count() counts ROW values that are NULL

count() counts ROW values that are NULL

From
Erwin Brandstetter
Date:
The manual says:

count ( "any" ) → bigint

Computes the number of input rows in which the input value is not null.


But ROW values or composite types that "are null" are counted, anyway. See:


Feels like a bug, but I cannot imagine how this would have slipped everybody's attention for so long. It should at least be documented. Maybe:

Computes the number of input rows in which the input value is not a plain NULL value. (Composite or ROW values count in any case - even if value IS NULL evaluates to true.)

Regards
Erwin

Re: count() counts ROW values that are NULL

From
Tom Lane
Date:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> The manual says:
>> count ( "any" ) → bigint
>> Computes the number of input rows in which the input value is not null.

> But ROW values or composite types that "are null" are counted, anyway.

Well, there's nulls and nulls.  The SQL "IS NULL" construct is fairly
badly designed IMO, because it considers both a plain NULL and a
row-of-all-NULL-fields to be "null".  count(), like just about everything
in Postgres other than "IS NULL", considers only a plain NULL to be null.

This is discussed somewhere in the manual, but I think it's under IS NULL,
not under all the other places that'd have to be annotated if we decide to
annotate as you're suggesting.  (One example is that functions that are
marked STRICT use the tighter interpretation.)

You could use "COUNT(*) FILTER (WHERE NOT (whatever IS NULL))" if you want
to count values meeting the IS NULL definition.  (Buttressing my point
that IS NULL is not well thought out, the obvious "whatever IS NOT NULL"
doesn't work here, because it's not the inverse of "whatever IS NULL".)

            regards, tom lane



Re: count() counts ROW values that are NULL

From
Erwin Brandstetter
Date:

On Fri, 18 Mar 2022 at 22:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> The manual says:
>> count ( "any" ) → bigint
>> Computes the number of input rows in which the input value is not null.

> But ROW values or composite types that "are null" are counted, anyway.

Well, there's nulls and nulls.  The SQL "IS NULL" construct is fairly
badly designed IMO, because it considers both a plain NULL and a
row-of-all-NULL-fields to be "null".  count(), like just about everything
in Postgres other than "IS NULL", considers only a plain NULL to be null.

This is discussed somewhere in the manual, but I think it's under IS NULL,
not under all the other places that'd have to be annotated if we decide to
annotate as you're suggesting.  (One example is that functions that are
marked STRICT use the tighter interpretation.)

You could use "COUNT(*) FILTER (WHERE NOT (whatever IS NULL))" if you want
to count values meeting the IS NULL definition.  (Buttressing my point
that IS NULL is not well thought out, the obvious "whatever IS NOT NULL"
doesn't work here, because it's not the inverse of "whatever IS NULL".)

I am aware of the mess, and I feel your pain (and my own).
But count(<expression>) is among the most frequently used functions, and hardly any user reading the manual will be aware of the implications. Maybe just:

... in which the input value is not null (does not evaluate to a scalar NULL).

To give them a fighting chance.


Regards

Erwin

Re: count() counts ROW values that are NULL

From
"David G. Johnston"
Date:
On Fri, Mar 18, 2022 at 3:06 PM Erwin Brandstetter <brsaweda@gmail.com> wrote:

On Fri, 18 Mar 2022 at 22:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Erwin Brandstetter <brsaweda@gmail.com> writes:
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> The manual says:
>> count ( "any" ) → bigint
>> Computes the number of input rows in which the input value is not null.

> But ROW values or composite types that "are null" are counted, anyway.

Well, there's nulls and nulls.  The SQL "IS NULL" construct is fairly
badly designed IMO, because it considers both a plain NULL and a
row-of-all-NULL-fields to be "null".  count(), like just about everything
in Postgres other than "IS NULL", considers only a plain NULL to be null.

This is discussed somewhere in the manual, but I think it's under IS NULL,
not under all the other places that'd have to be annotated if we decide to
annotate as you're suggesting.  (One example is that functions that are
marked STRICT use the tighter interpretation.)

 
But count(<expression>) is among the most frequently used functions, and hardly any user reading the manual will be aware of the implications. Maybe just:

I'm with Tom on this.  The behavior exhibited is the expected behavior.  I haven't looked, but if anything I would make the desired point in "composite IS NULL" that this special (ROW(null) IS NULL -> true) interpretation of NULL is limited to this SQL Standard mandated operator and that when speaking generally about a composite being null throughout the documentation it is done in a scalar sense (I don't know how best to word this but select null::rel yields "" while select row(null)::rel yields "()" on printout (assuming rel has a single column)).

David J.