Thread: BUG #4509: array_cat's null behaviour is inconsistent
The following bug has been logged online: Bug reference: 4509 Logged by: Kevin Field Email address: kev@brantaero.com PostgreSQL version: 8.3.4 Operating system: Windows Server 2003 SP2 Description: array_cat's null behaviour is inconsistent Details: Section 9.2 in the docs say, 'The ordinary comparison operators yield null (signifying "unknown") when either input is null.' This applies to other operators too. For example, the result of tacking an unknown value onto a known one is unknown, because you don't know what exactly you just tacked on. So select null::text || 'hello'; ...returns NULL, which makes sense. But then this select array_cat(null::integer[], '{3}'::integer[]); ...doesn't return NULL, which it should, for consistency. Kev
Kevin Field wrote: > Section 9.2 in the docs say, 'The ordinary comparison operators yield null > (signifying "unknown") when either input is null.' This applies to other > operators too. For example, the result of tacking an unknown value onto a > known one is unknown, because you don't know what exactly you just tacked > on. So > > select null::text || 'hello'; > > ...returns NULL, which makes sense. But then this > > select array_cat(null::integer[], '{3}'::integer[]); > > ...doesn't return NULL, which it should, for consistency. Doesn't make sense to me either. I found that this was changed between 8.1 (where it returns null) and 8.2, but I find nothing in the respective release notes or commit messages about this change. Tom?
Peter Eisentraut <peter_e@gmx.net> writes: > Kevin Field wrote: >> select array_cat(null::integer[], '{3}'::integer[]); >> ...doesn't return NULL, which it should, for consistency. > Doesn't make sense to me either. I found that this was changed between > 8.1 (where it returns null) and 8.2, but I find nothing in the > respective release notes or commit messages about this change. Tom? Hm, I seem to have changed it in the patch that added support for nulls within arrays, but I don't recall the reasoning for this detail ATM. I'm loath to change just this one thing though. If we're going to monkey with it, I think we should start by having a thought-through proposal about the behavior of all the array functions for corner cases like null arrays and empty arrays (cf recent discussion about zero-dimensional arrays). regards, tom lane
> I'm loath to change just this one thing though. If we're going to > monkey with it, I think we should start by having a thought-through > proposal about the behavior of all the array functions for corner > cases like null arrays and empty arrays (cf recent discussion about > zero-dimensional arrays). If those are the only corner cases, here's something to work with. Looking at http://www.postgresql.org/docs/8.3/interactive/functions-array.html, I'd propose that every one of those operators and functions should return NULL::(normal-return-type) if any of their operands are NULL, to be consistent with Section 9.2 and to differentiate from the empty- array cases below, and that logic should precede anything to do with empty arrays. As for empty arrays, the following additions to the logic: = true if both arrays are empty <> true if either array is empty but not both < true if left operand is empty and right is not > true if right operand is empty and left is not <= true if left operand is empty >= true if right operand is empty @> true if right operand is empty <@ true if left operand is empty && false if either array is empty || (both array-array cases) return other array if one is empty (if left is empty, return right, elsif right is empty, return left) || (both element-array cases) return ARRAY[element] if array is empty array_append return ARRAY[element] if array is empty array_cat return other array if one is empty (if left is empty, return right, elsif right is empty, return left) array_dims return results obtained by ignoring empty sub-arrays; if entire array is empty, return '' array_lower raise exception if array is empty (there are no dimensions to inquire about) array_prepend return ARRAY[element] if array is empty array_to_string return '' if array is empty array_upper raise exception if array is empty (there are no dimensions to inquire about) string_to_array return '' if left operand is '' Hope that's useful... Kev
Kev <kevinjamesfield@gmail.com> writes: >> I'm loath to change just this one thing though. If we're going to >> monkey with it, I think we should start by having a thought-through >> proposal about the behavior of all the array functions for corner >> cases like null arrays and empty arrays (cf recent discussion about >> zero-dimensional arrays). > If those are the only corner cases, here's something to work with. I don't think it's quite as easy as you think ... > Looking at http://www.postgresql.org/docs/8.3/interactive/functions-array.html, > I'd propose that every one of those operators and functions should > return NULL::(normal-return-type) if any of their operands are NULL, It is *certainly* not the case that array_append and array_prepend should return NULL when their scalar argument is NULL --- that would break a key component of the support for null elements in arrays. As for the array argument, consider this plpgsql code: declare arr int[]; r record; begin for r in select ... loop arr := arr || r.intfield; end loop; Right now that will "do the right thing" because array_append treats a null array input as though it were empty. Now we could decree that this code is wrong and the user should have initialized arr to '{}' or some such, but ISTM that that's basically breaking code to no purpose. There isn't any real type-safety or other code quality benefit to be gained from making this fail. (After having thought about it a bit, I believe that the rationale for having array_cat treat null arrays as it does was similar.) > || (both array-array cases) return other array if one is empty (if > left is empty, return right, elsif right is empty, return left) We could do that, but note that || has two different behaviors depending on whether the arrays have the same rank, and it throws an error if the ranks are more than one apart. So doing the above amounts to saying that an empty array can have any rank depending on context. Not sure that I like that --- this gets back to the whole issue of exactly what we define an empty array as meaning. (Note that array_cat does in fact currently behave as you suggest. So changing it could break existing code; but I'm thinking that there would be far fewer such cases than for the append-a-scalar situation.) > array_lower raise exception if array is empty (there are no dimensions > to inquire about) > array_upper raise exception if array is empty (there are no dimensions > to inquire about) Well, these beg the question: is an empty array zero-dimensional, or is it a one-dimensional array of no elements, or perhaps both of those as well as higher-dimensional cases where any axis has zero elements, or ??? It's really all kind of messy ... we need to trade off simplicity of definition, ease of use, backwards compatibility, and standards compliance (though the standard has only 1-D arrays so it's of just limited help here). regards, tom lane
Tom Lane wrote: > It's really all kind of messy ... we need to trade off simplicity of > definition, ease of use, backwards compatibility, and standards > compliance (though the standard has only 1-D arrays so it's of just > limited help here). AFAICT, the standard would certainly allow a type specification of basetype ARRAY ARRAY ARRAY ... which gives you a multidimensional array (in the same sense that C has multidimensional arrays). If my reading of everything is right, the subscript order that this would require is the opposite of what PostgreSQL uses for multidimensional arrays. Talk about messy ...
Added to TODO: Improve handling of NULLs in arrays * http://archives.postgresql.org/pgsql-bugs/2008-11/msg00009.php --------------------------------------------------------------------------- Kevin Field wrote: > > The following bug has been logged online: > > Bug reference: 4509 > Logged by: Kevin Field > Email address: kev@brantaero.com > PostgreSQL version: 8.3.4 > Operating system: Windows Server 2003 SP2 > Description: array_cat's null behaviour is inconsistent > Details: > > Section 9.2 in the docs say, 'The ordinary comparison operators yield null > (signifying "unknown") when either input is null.' This applies to other > operators too. For example, the result of tacking an unknown value onto a > known one is unknown, because you don't know what exactly you just tacked > on. So > > select null::text || 'hello'; > > ...returns NULL, which makes sense. But then this > > select array_cat(null::integer[], '{3}'::integer[]); > > ...doesn't return NULL, which it should, for consistency. > > Kev > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +