Thread: ARRAY() returning NULL instead of ARRAY[] resp. {}

ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Markus Bertheau
Date:
Hi,

why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
ARRAY[] resp. '{}'?

Markus

--
Markus Bertheau <twanger@bluetwanger.de>

Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
"Jan B."
Date:
Markus Bertheau wrote:
> Hi,
> 
> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
> ARRAY[] resp. '{}'?
> 
> Markus
> 

Perhaps Arrays always have to contain at least one element? (I don't 
know for sure.)

SELECT array[];
ERROR:  syntax error at or near "]" at character 14
LINE 1: SELECT array[];

SELECT array[1]; array
------- {1}
(1 row)


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Markus Bertheau
Date:
Dnia 23-05-2005, pon o godzinie 18:54 +0000, Jan B. napisał(a):

> Perhaps Arrays always have to contain at least one element? (I don't
> know for sure.)

They can:

template1=# select '{}'::TEXT[];text
------{}
(1 запись)

I don't know, why the ARRAY[] syntax doesn't work for empty arrays.

Markus

--
Markus Bertheau <twanger@bluetwanger.de>

Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Joe Conway
Date:
Markus Bertheau wrote:
> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
> ARRAY[] resp. '{}'?
> 

Why would you expect an empty array instead of a NULL? NULL is what 
you'd get for other data types -- for example:

regression=# SELECT (SELECT 1 WHERE FALSE) IS NULL; ?column?
---------- t
(1 row)

Joe


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Markus Bertheau wrote:
>> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
>> ARRAY[] resp. '{}'?

> Why would you expect an empty array instead of a NULL?

I think he's got a good point, actually.  We document the ARRAY-with-
parens-around-a-SELECT syntax as
The resulting one-dimensional array will have an element foreach row in the subquery result, with an element type
matchingthatof the subquery's output column.
 

To me, that implies that a subquery result of no rows generates a
one-dimensional array of no elements, not a null array.

This is not the same as

SELECT ARRAY[(SELECT 1 WHERE FALSE)];

We define a scalar subquery that returns no rows as returning null, so
this is equivalent to

SELECT ARRAY[NULL];

which *ought* to yield an array containing a single NULL element,
but since we can't yet handle arrays containing nulls we punt and
return a null array value.  That's wrong too ... but it's a different
issue.  The point Markus is complaining about seems like it should
be easily fixable.
        regards, tom lane


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Markus Bertheau
Date:
Dnia 24-05-2005, wto o godzinie 00:06 -0400, Tom Lane napisał(a):
> Joe Conway <mail@joeconway.com> writes:
> > Markus Bertheau wrote:
> >> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
> >> ARRAY[] resp. '{}'?
>
> > Why would you expect an empty array instead of a NULL?
>
> I think he's got a good point, actually.  We document the ARRAY-with-
> parens-around-a-SELECT syntax as
>
>     The resulting one-dimensional array will have an element for
>     each row in the subquery result, with an element type matching
>     that of the subquery's output column.
>
> To me, that implies that a subquery result of no rows generates a
> one-dimensional array of no elements, not a null array.
>
> The point Markus is complaining about seems like it should
> be easily fixable.

Great :) Is this a TODO?

--
Markus Bertheau <twanger@bluetwanger.de>

Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Achilleus Mantzios
Date:
O Joe Conway έγραψε στις May 23, 2005 :

> Markus Bertheau wrote:
> > why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
> > ARRAY[] resp. '{}'?
> > 
> 
> Why would you expect an empty array instead of a NULL? NULL is what 
> you'd get for other data types -- for example:

One could ask in the same fashion why someone would want a table
if this table contains no rows.

A null value may mean "dont know",
wheras a '{}' (empty) value may mean "empty set".

For instance lets consider the case where an array holds
the factors of a polynomial formula.

An null value might mean that the person defining
the formulas haven't been bothered with this one yet.
An empty value might mean that the person indicates
that has worked on this particular one, but he/she has no data yet.

Ok extreme cases, but to me there is a clean distinction
between a null array and an empty array.

Also what is definately needed is arrays that may contain
null values.

> 
> regression=# SELECT (SELECT 1 WHERE FALSE) IS NULL;
>   ?column?
> ----------
>   t
> (1 row)
> 
> Joe
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq
> 

-- 
-Achilleus



Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Joe Conway
Date:
Tom Lane wrote:
> I think he's got a good point, actually.  We document the ARRAY-with-
> parens-around-a-SELECT syntax as
>
>     The resulting one-dimensional array will have an element for
>     each row in the subquery result, with an element type matching
>     that of the subquery's output column.
>
> To me, that implies that a subquery result of no rows generates a
> one-dimensional array of no elements, not a null array.

OK, looks like I'm outnumbered.

But as far as I know, we have never had a way to produce a
one-dimensional empty array. Empty arrays thus far have been dimensionless.

Assuming we really want an empty 1D array, I created the attached patch.
This works fine, but now leaves a few oddities to be dealt with, e.g.:

regression=# select array_dims(array(select 1 where false));
  array_dims
------------
  [1:0]
(1 row)

Any thoughts on how this should be handled for an empty 1D array?

> The point Markus is complaining about seems like it should
> be easily fixable.

Well, "easily" is a relative term. My Postgres hacking neurons have
gotten kind of rusty lately -- but then maybe that was your underlying
point ;-)

Joe

Index: src/backend/executor/nodeSubplan.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v
retrieving revision 1.69
diff -c -r1.69 nodeSubplan.c
*** src/backend/executor/nodeSubplan.c    6 May 2005 17:24:54 -0000    1.69
--- src/backend/executor/nodeSubplan.c    26 May 2005 18:52:16 -0000
***************
*** 215,220 ****
--- 215,221 ----
      ListCell   *pvar;
      ListCell   *l;
      ArrayBuildState *astate = NULL;
+     Oid            element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid;

      /*
       * We are probably in a short-lived expression-evaluation context.
***************
*** 259,268 ****
       *
       * For EXPR_SUBLINK we require the subplan to produce no more than one
       * tuple, else an error is raised. For ARRAY_SUBLINK we allow the
!      * subplan to produce more than one tuple. In either case, if zero
!      * tuples are produced, we return NULL. Assuming we get a tuple, we
!      * just use its first column (there can be only one non-junk column in
!      * this case).
       */
      result = BoolGetDatum(subLinkType == ALL_SUBLINK);
      *isNull = false;
--- 260,269 ----
       *
       * For EXPR_SUBLINK we require the subplan to produce no more than one
       * tuple, else an error is raised. For ARRAY_SUBLINK we allow the
!      * subplan to produce more than one tuple. In the former case, if zero
!      * tuples are produced, we return NULL. In the latter, we return an
!      * empty array. Assuming we get a tuple, we just use its first column
!      * (there can be only one non-junk column in this case).
       */
      result = BoolGetDatum(subLinkType == ALL_SUBLINK);
      *isNull = false;
***************
*** 432,458 ****
          }
      }

!     if (!found)
      {
          /*
           * deal with empty subplan result.    result/isNull were previously
!          * initialized correctly for all sublink types except EXPR, ARRAY,
           * and MULTIEXPR; for those, return NULL.
           */
          if (subLinkType == EXPR_SUBLINK ||
-             subLinkType == ARRAY_SUBLINK ||
              subLinkType == MULTIEXPR_SUBLINK)
          {
              result = (Datum) 0;
              *isNull = true;
          }
      }
-     else if (subLinkType == ARRAY_SUBLINK)
-     {
-         Assert(astate != NULL);
-         /* We return the result in the caller's context */
-         result = makeArrayResult(astate, oldcontext);
-     }

      MemoryContextSwitchTo(oldcontext);

--- 433,459 ----
          }
      }

!     if (subLinkType == ARRAY_SUBLINK)
!     {
!         if (!astate)
!             astate = initArrayResult(element_type, oldcontext);
!         /* We return the result in the caller's context */
!         result = makeArrayResult(astate, oldcontext);
!     }
!     else if (!found)
      {
          /*
           * deal with empty subplan result.    result/isNull were previously
!          * initialized correctly for all sublink types except EXPR
           * and MULTIEXPR; for those, return NULL.
           */
          if (subLinkType == EXPR_SUBLINK ||
              subLinkType == MULTIEXPR_SUBLINK)
          {
              result = (Datum) 0;
              *isNull = true;
          }
      }

      MemoryContextSwitchTo(oldcontext);

***************
*** 925,930 ****
--- 926,932 ----
      ListCell   *l;
      bool        found = false;
      ArrayBuildState *astate = NULL;
+     Oid            element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid;

      /*
       * Must switch to child query's per-query memory context.
***************
*** 1010,1016 ****
          }
      }

!     if (!found)
      {
          if (subLinkType == EXISTS_SUBLINK)
          {
--- 1012,1033 ----
          }
      }

!     if (subLinkType == ARRAY_SUBLINK)
!     {
!         /* There can be only one param... */
!         int            paramid = linitial_int(subplan->setParam);
!         ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
!
!         prm->execPlan = NULL;
!
!         if (!astate)
!             astate = initArrayResult(element_type, oldcontext);
!
!         /* We build the result in query context so it won't disappear */
!         prm->value = makeArrayResult(astate, econtext->ecxt_per_query_memory);
!         prm->isnull = false;
!     }
!     else if (!found)
      {
          if (subLinkType == EXISTS_SUBLINK)
          {
***************
*** 1035,1052 ****
              }
          }
      }
-     else if (subLinkType == ARRAY_SUBLINK)
-     {
-         /* There can be only one param... */
-         int            paramid = linitial_int(subplan->setParam);
-         ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]);
-
-         Assert(astate != NULL);
-         prm->execPlan = NULL;
-         /* We build the result in query context so it won't disappear */
-         prm->value = makeArrayResult(astate, econtext->ecxt_per_query_memory);
-         prm->isnull = false;
-     }

      MemoryContextSwitchTo(oldcontext);
  }
--- 1052,1057 ----
Index: src/backend/utils/adt/arrayfuncs.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/arrayfuncs.c,v
retrieving revision 1.120
diff -c -r1.120 arrayfuncs.c
*** src/backend/utils/adt/arrayfuncs.c    1 May 2005 18:56:18 -0000    1.120
--- src/backend/utils/adt/arrayfuncs.c    26 May 2005 18:52:16 -0000
***************
*** 3252,3257 ****
--- 3252,3293 ----
                       &my_extra->amstate);
  }

+
+ /*
+  * initArrayResult - initialize an ArrayBuildState for an array result
+  *
+  *    rcontext is where to keep working state
+  */
+ ArrayBuildState *
+ initArrayResult(Oid element_type, MemoryContext rcontext)
+ {
+     ArrayBuildState       *astate;
+     MemoryContext        arr_context,
+                         oldcontext;
+
+     /* Make a temporary context to hold all the junk */
+     arr_context = AllocSetContextCreate(rcontext,
+                                         "accumArrayResult",
+                                         ALLOCSET_DEFAULT_MINSIZE,
+                                         ALLOCSET_DEFAULT_INITSIZE,
+                                         ALLOCSET_DEFAULT_MAXSIZE);
+     oldcontext = MemoryContextSwitchTo(arr_context);
+     astate = (ArrayBuildState *) palloc(sizeof(ArrayBuildState));
+     astate->mcontext = arr_context;
+     astate->dvalues = (Datum *)
+         palloc(ARRAY_ELEMS_CHUNKSIZE * sizeof(Datum));
+     astate->nelems = 0;
+     astate->element_type = element_type;
+     get_typlenbyvalalign(element_type,
+                             &astate->typlen,
+                             &astate->typbyval,
+                             &astate->typalign);
+
+     MemoryContextSwitchTo(oldcontext);
+
+     return astate;
+ }
+
  /*
   * accumArrayResult - accumulate one (more) Datum for an array result
   *
***************
*** 3264,3293 ****
                   Oid element_type,
                   MemoryContext rcontext)
  {
!     MemoryContext arr_context,
!                 oldcontext;

      if (astate == NULL)
      {
          /* First time through --- initialize */
!
!         /* Make a temporary context to hold all the junk */
!         arr_context = AllocSetContextCreate(rcontext,
!                                             "accumArrayResult",
!                                             ALLOCSET_DEFAULT_MINSIZE,
!                                             ALLOCSET_DEFAULT_INITSIZE,
!                                             ALLOCSET_DEFAULT_MAXSIZE);
!         oldcontext = MemoryContextSwitchTo(arr_context);
!         astate = (ArrayBuildState *) palloc(sizeof(ArrayBuildState));
!         astate->mcontext = arr_context;
!         astate->dvalues = (Datum *)
!             palloc(ARRAY_ELEMS_CHUNKSIZE * sizeof(Datum));
!         astate->nelems = 0;
!         astate->element_type = element_type;
!         get_typlenbyvalalign(element_type,
!                              &astate->typlen,
!                              &astate->typbyval,
!                              &astate->typalign);
      }
      else
      {
--- 3300,3311 ----
                   Oid element_type,
                   MemoryContext rcontext)
  {
!     MemoryContext oldcontext;

      if (astate == NULL)
      {
          /* First time through --- initialize */
!         astate = initArrayResult(element_type, rcontext);
      }
      else
      {
Index: src/include/utils/array.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/array.h,v
retrieving revision 1.54
diff -c -r1.54 array.h
*** src/include/utils/array.h    29 Mar 2005 00:17:18 -0000    1.54
--- src/include/utils/array.h    26 May 2005 18:52:16 -0000
***************
*** 176,181 ****
--- 176,182 ----
                    Oid elmtype,
                    int elmlen, bool elmbyval, char elmalign,
                    Datum **elemsp, int *nelemsp);
+ extern ArrayBuildState *initArrayResult(Oid element_type, MemoryContext rcontext);
  extern ArrayBuildState *accumArrayResult(ArrayBuildState *astate,
                   Datum dvalue, bool disnull,
                   Oid element_type,

Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Bruce Momjian
Date:
Joe Conway wrote:
> Tom Lane wrote:
> > I think he's got a good point, actually.  We document the ARRAY-with-
> > parens-around-a-SELECT syntax as
> > 
> >     The resulting one-dimensional array will have an element for
> >     each row in the subquery result, with an element type matching
> >     that of the subquery's output column.
> > 
> > To me, that implies that a subquery result of no rows generates a
> > one-dimensional array of no elements, not a null array.
> 
> OK, looks like I'm outnumbered.
> 
> But as far as I know, we have never had a way to produce a 
> one-dimensional empty array. Empty arrays thus far have been dimensionless.
> 
> Assuming we really want an empty 1D array, I created the attached patch. 
> This works fine, but now leaves a few oddities to be dealt with, e.g.:
> 
> regression=# select array_dims(array(select 1 where false));
>   array_dims
> ------------
>   [1:0]
> (1 row)
> 
> Any thoughts on how this should be handled for an empty 1D array?
> 
> > The point Markus is complaining about seems like it should
> > be easily fixable.
> 
> Well, "easily" is a relative term. My Postgres hacking neurons have 
> gotten kind of rusty lately -- but then maybe that was your underlying 
> point ;-)

No one responed to this email, so I will try.  Is this the one
dimmentional array you were talking about?
test=> select array_dims('{}'::integer[]); array_dims------------(1 row)

Why is [1:0] wrong to return?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Joe Conway
Date:
Bruce Momjian wrote:
> Joe Conway wrote:
>>
>>Any thoughts on how this should be handled for an empty 1D array?
> 
> No one responed to this email, so I will try.  Is this the one
> dimmentional array you were talking about?
> 
>     test=> select array_dims('{}'::integer[]);
>      array_dims
>     ------------
>     
>     (1 row)

In this case, what you get is actually a dimensionless array. Literally, 
you get this:
if (nitems == 0){    /* Return empty array */    retval = (ArrayType *) palloc0(sizeof(ArrayType));    retval->size =
sizeof(ArrayType);   retval->elemtype = element_type;    PG_RETURN_ARRAYTYPE_P(retval);}
 

I.e. the array structure is allocated, the size is set (which is 
required since arrays are varlena), and the element type is initialized. 
There is no initialization of ndim, ARR_DIMS(), or ARR_LBOUND().

In this case, since there are no dimensions, array_dims() probably does 
the right thing by returning NULL.

> Why is [1:0] wrong to return?
> 

I'm not sure it is wrong -- it just seems a bit strange. The difference 
is that in order to return an empty *one-dimensional* array, ndim, 
ARR_DIMS(), and ARR_LBOUND() are all appropriately set (by the patched 
code). Basically, ndim == 1, ARR_DIMS() is a single element int array (C 
array that is) indicating 0 elements for dimension 1, and ARR_LBOUND() 
is a single element int array indicating a lower bound of 1. This leads 
to the array_dims() return value of [1:0]. The value 1 is unquestionably 
correct for the lower bound index, but what should be reported for the 
upper bound? We can't return [1:1], because that would indicate that we 
have one element.

Joe


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Bruce Momjian
Date:
Joe Conway wrote:
> Bruce Momjian wrote:
> > Joe Conway wrote:
> >>
> >>Any thoughts on how this should be handled for an empty 1D array?
> > 
> > No one responed to this email, so I will try.  Is this the one
> > dimmentional array you were talking about?
> > 
> >     test=> select array_dims('{}'::integer[]);
> >      array_dims
> >     ------------
> >     
> >     (1 row)
> 
> In this case, what you get is actually a dimensionless array. Literally, 
> you get this:
> 
>     if (nitems == 0)
>     {
>         /* Return empty array */
>         retval = (ArrayType *) palloc0(sizeof(ArrayType));
>         retval->size = sizeof(ArrayType);
>         retval->elemtype = element_type;
>         PG_RETURN_ARRAYTYPE_P(retval);
>     }
> 
> I.e. the array structure is allocated, the size is set (which is 
> required since arrays are varlena), and the element type is initialized. 
> There is no initialization of ndim, ARR_DIMS(), or ARR_LBOUND().
> 
> In this case, since there are no dimensions, array_dims() probably does 
> the right thing by returning NULL.
> 
> > Why is [1:0] wrong to return?
> > 
> 
> I'm not sure it is wrong -- it just seems a bit strange. The difference 
> is that in order to return an empty *one-dimensional* array, ndim, 
> ARR_DIMS(), and ARR_LBOUND() are all appropriately set (by the patched 
> code). Basically, ndim == 1, ARR_DIMS() is a single element int array (C 
> array that is) indicating 0 elements for dimension 1, and ARR_LBOUND() 
> is a single element int array indicating a lower bound of 1. This leads 
> to the array_dims() return value of [1:0]. The value 1 is unquestionably 
> correct for the lower bound index, but what should be reported for the 
> upper bound? We can't return [1:1], because that would indicate that we 
> have one element.

OK, so '[1:0]' seems correct.  How would to specify such an array
manually in a string?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Markus Bertheau ☭
Date:
В Втр, 24/05/2005 в 00:06 -0400, Tom Lane пишет:
> Joe Conway <mail@joeconway.com> writes:
> > Markus Bertheau wrote:
> >> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of
> >> ARRAY[] resp. '{}'?
> 
> > Why would you expect an empty array instead of a NULL?
> 
> I think he's got a good point, actually.  We document the ARRAY-with-
> parens-around-a-SELECT syntax as
> 
>     The resulting one-dimensional array will have an element for
>     each row in the subquery result, with an element type matching
>     that of the subquery's output column.
> 
> To me, that implies that a subquery result of no rows generates a
> one-dimensional array of no elements, not a null array.

By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of
NULL.

Markus

-- 
Markus Bertheau ☭ <twanger@bluetwanger.de>



Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Tom Lane
Date:
Markus Bertheau ☭ <twanger@bluetwanger.de> writes:
> By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of
> NULL.

No, that doesn't follow ... we've traditionally considered '{}' to
denote a zero-dimensional array.  A 1-D array of no elements is
'[1:0]={}', just as Joe shows ... or at least it would be except
for an overenthusiastic error check:

regression=# select '[1:0]={}' :: int[];
ERROR:  upper bound cannot be less than lower bound

I think this should be a legal boundary case.  In general, it should be
possible to form zero-size arrays of any number of dimensions.
        regards, tom lane


Re: ARRAY() returning NULL instead of ARRAY[] resp. {}

From
Joe Conway
Date:
Tom Lane wrote:
> Markus Bertheau ☭ <twanger@bluetwanger.de> writes:
> 
>>By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of
>>NULL.
> 
> No, that doesn't follow ... we've traditionally considered '{}' to
> denote a zero-dimensional array.  A 1-D array of no elements is
> '[1:0]={}', just as Joe shows ... or at least it would be except
> for an overenthusiastic error check:
> 
> regression=# select '[1:0]={}' :: int[];
> ERROR:  upper bound cannot be less than lower bound
> 
> I think this should be a legal boundary case.  In general, it should be
> possible to form zero-size arrays of any number of dimensions.
> 

I've been playing with exactly this over the weekend. Of course, as 
usual, the devil is in the details. For instance, using the above 
notation, how would I specify a zero-element 1D array starting at a 
lower bound index of 0? The answer following the above pattern would be:
  select '[0:-1]={}'::int[];

You could not use '[0:0]={}'::int[], because that would be a one-element 
array. I propose the following instead:

regression=# select '[1:]={}' :: int[]; int4
------ {}
(1 row)

regression=# select array_dims('[1:]={}' :: int[]); array_dims
------------ [1:]
(1 row)

In other words, an missing upper bound indicates zero elements.

Now the next question; what does a 2D zero-element array look like? I 
think this makes sense:

regression=# select '[1:2][1:]={{},{}}'::int[]; int4
------ {}
(1 row)

Except (I think) array_out() should probably output something closer to 
the input literal. Any thoughts on this?

Joe