Thread: BUG #4509: array_cat's null behaviour is inconsistent

BUG #4509: array_cat's null behaviour is inconsistent

From
"Kevin Field"
Date:
The following bug has been logged online:

Bug reference:      4509
Logged by:          Kevin Field
Email address:      kev@brantaero.com
PostgreSQL version: 8.3.4
Operating system:   Windows Server 2003 SP2
Description:        array_cat's null behaviour is inconsistent
Details:

Section 9.2 in the docs say, 'The ordinary comparison operators yield null
(signifying "unknown") when either input is null.'  This applies to other
operators too.  For example, the result of tacking an unknown value onto a
known one is unknown, because you don't know what exactly you just tacked
on.  So

   select null::text || 'hello';

...returns NULL, which makes sense.  But then this

   select array_cat(null::integer[], '{3}'::integer[]);

...doesn't return NULL, which it should, for consistency.

Kev

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Peter Eisentraut
Date:
Kevin Field wrote:
> Section 9.2 in the docs say, 'The ordinary comparison operators yield null
> (signifying "unknown") when either input is null.'  This applies to other
> operators too.  For example, the result of tacking an unknown value onto a
> known one is unknown, because you don't know what exactly you just tacked
> on.  So
>
>    select null::text || 'hello';
>
> ...returns NULL, which makes sense.  But then this
>
>    select array_cat(null::integer[], '{3}'::integer[]);
>
> ...doesn't return NULL, which it should, for consistency.

Doesn't make sense to me either.  I found that this was changed between
8.1 (where it returns null) and 8.2, but I find nothing in the
respective release notes or commit messages about this change.  Tom?

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Kevin Field wrote:
>> select array_cat(null::integer[], '{3}'::integer[]);
>> ...doesn't return NULL, which it should, for consistency.

> Doesn't make sense to me either.  I found that this was changed between
> 8.1 (where it returns null) and 8.2, but I find nothing in the
> respective release notes or commit messages about this change.  Tom?

Hm, I seem to have changed it in the patch that added support for nulls
within arrays, but I don't recall the reasoning for this detail ATM.

I'm loath to change just this one thing though.  If we're going to
monkey with it, I think we should start by having a thought-through
proposal about the behavior of all the array functions for corner
cases like null arrays and empty arrays (cf recent discussion about
zero-dimensional arrays).

            regards, tom lane

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Kev
Date:
> I'm loath to change just this one thing though.  If we're going to
> monkey with it, I think we should start by having a thought-through
> proposal about the behavior of all the array functions for corner
> cases like null arrays and empty arrays (cf recent discussion about
> zero-dimensional arrays).

If those are the only corner cases, here's something to work with.

Looking at http://www.postgresql.org/docs/8.3/interactive/functions-array.html,
I'd propose that every one of those operators and functions should
return NULL::(normal-return-type) if any of their operands are NULL,
to be consistent with Section 9.2 and to differentiate from the empty-
array cases below, and that logic should precede anything to do with
empty arrays.  As for empty arrays, the following additions to the
logic:

= true if both arrays are empty
<> true if either array is empty but not both
< true if left operand is empty and right is not
> true if right operand is empty and left is not
<= true if left operand is empty
>= true if right operand is empty
@> true if right operand is empty
<@ true if left operand is empty
&& false if either array is empty
|| (both array-array cases) return other array if one is empty (if
left is empty, return right, elsif right is empty, return left)
|| (both element-array cases) return ARRAY[element] if array is empty

array_append return ARRAY[element] if array is empty
array_cat return other array if one is empty (if left is empty, return
right, elsif right is empty, return left)
array_dims return results obtained by ignoring empty sub-arrays; if
entire array is empty, return ''
array_lower raise exception if array is empty (there are no dimensions
to inquire about)
array_prepend return ARRAY[element] if array is empty
array_to_string return '' if array is empty
array_upper raise exception if array is empty (there are no dimensions
to inquire about)
string_to_array return '' if left operand is ''

Hope that's useful...
Kev

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Tom Lane
Date:
Kev <kevinjamesfield@gmail.com> writes:
>> I'm loath to change just this one thing though.  If we're going to
>> monkey with it, I think we should start by having a thought-through
>> proposal about the behavior of all the array functions for corner
>> cases like null arrays and empty arrays (cf recent discussion about
>> zero-dimensional arrays).

> If those are the only corner cases, here's something to work with.

I don't think it's quite as easy as you think ...

> Looking at http://www.postgresql.org/docs/8.3/interactive/functions-array.html,
> I'd propose that every one of those operators and functions should
> return NULL::(normal-return-type) if any of their operands are NULL,

It is *certainly* not the case that array_append and array_prepend
should return NULL when their scalar argument is NULL --- that would
break a key component of the support for null elements in arrays.
As for the array argument, consider this plpgsql code:

    declare arr int[];
            r record;
    begin
    for r in select ... loop
        arr := arr || r.intfield;
    end loop;

Right now that will "do the right thing" because array_append treats
a null array input as though it were empty.  Now we could decree that
this code is wrong and the user should have initialized arr to '{}' or
some such, but ISTM that that's basically breaking code to no purpose.
There isn't any real type-safety or other code quality benefit to be
gained from making this fail.

(After having thought about it a bit, I believe that the rationale for
having array_cat treat null arrays as it does was similar.)

> || (both array-array cases) return other array if one is empty (if
> left is empty, return right, elsif right is empty, return left)

We could do that, but note that || has two different behaviors depending
on whether the arrays have the same rank, and it throws an error if the
ranks are more than one apart.  So doing the above amounts to saying
that an empty array can have any rank depending on context.  Not sure
that I like that --- this gets back to the whole issue of exactly what
we define an empty array as meaning.  (Note that array_cat does in fact
currently behave as you suggest.  So changing it could break existing
code; but I'm thinking that there would be far fewer such cases than
for the append-a-scalar situation.)

> array_lower raise exception if array is empty (there are no dimensions
> to inquire about)
> array_upper raise exception if array is empty (there are no dimensions
> to inquire about)

Well, these beg the question: is an empty array zero-dimensional, or
is it a one-dimensional array of no elements, or perhaps both of those
as well as higher-dimensional cases where any axis has zero elements,
or ???

It's really all kind of messy ... we need to trade off simplicity of
definition, ease of use, backwards compatibility, and standards
compliance (though the standard has only 1-D arrays so it's of just
limited help here).

            regards, tom lane

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Peter Eisentraut
Date:
Tom Lane wrote:
> It's really all kind of messy ... we need to trade off simplicity of
> definition, ease of use, backwards compatibility, and standards
> compliance (though the standard has only 1-D arrays so it's of just
> limited help here).

AFAICT, the standard would certainly allow a type specification of

    basetype ARRAY ARRAY ARRAY ...

which gives you a multidimensional array (in the same sense that C has
multidimensional arrays).  If my reading of everything is right, the
subscript order that this would require is the opposite of what
PostgreSQL uses for multidimensional arrays.  Talk about messy ...

Re: BUG #4509: array_cat's null behaviour is inconsistent

From
Bruce Momjian
Date:
Added to TODO:

    Improve handling of NULLs in arrays

        * http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php

---------------------------------------------------------------------------

Kevin Field wrote:
>
> The following bug has been logged online:
>
> Bug reference:      4509
> Logged by:          Kevin Field
> Email address:      kev@brantaero.com
> PostgreSQL version: 8.3.4
> Operating system:   Windows Server 2003 SP2
> Description:        array_cat's null behaviour is inconsistent
> Details:
>
> Section 9.2 in the docs say, 'The ordinary comparison operators yield null
> (signifying "unknown") when either input is null.'  This applies to other
> operators too.  For example, the result of tacking an unknown value onto a
> known one is unknown, because you don't know what exactly you just tacked
> on.  So
>
>    select null::text || 'hello';
>
> ...returns NULL, which makes sense.  But then this
>
>    select array_cat(null::integer[], '{3}'::integer[]);
>
> ...doesn't return NULL, which it should, for consistency.
>
> Kev
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +