Thread: ARRAY() returning NULL instead of ARRAY[] resp. {}
Hi, why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of ARRAY[] resp. '{}'? Markus -- Markus Bertheau <twanger@bluetwanger.de>
Markus Bertheau wrote: > Hi, > > why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of > ARRAY[] resp. '{}'? > > Markus > Perhaps Arrays always have to contain at least one element? (I don't know for sure.) SELECT array[]; ERROR: syntax error at or near "]" at character 14 LINE 1: SELECT array[]; SELECT array[1]; array ------- {1} (1 row)
Dnia 23-05-2005, pon o godzinie 18:54 +0000, Jan B. napisał(a): > Perhaps Arrays always have to contain at least one element? (I don't > know for sure.) They can: template1=# select '{}'::TEXT[];text ------{} (1 запись) I don't know, why the ARRAY[] syntax doesn't work for empty arrays. Markus -- Markus Bertheau <twanger@bluetwanger.de>
Markus Bertheau wrote: > why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of > ARRAY[] resp. '{}'? > Why would you expect an empty array instead of a NULL? NULL is what you'd get for other data types -- for example: regression=# SELECT (SELECT 1 WHERE FALSE) IS NULL; ?column? ---------- t (1 row) Joe
Joe Conway <mail@joeconway.com> writes: > Markus Bertheau wrote: >> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of >> ARRAY[] resp. '{}'? > Why would you expect an empty array instead of a NULL? I think he's got a good point, actually. We document the ARRAY-with- parens-around-a-SELECT syntax as The resulting one-dimensional array will have an element foreach row in the subquery result, with an element type matchingthatof the subquery's output column. To me, that implies that a subquery result of no rows generates a one-dimensional array of no elements, not a null array. This is not the same as SELECT ARRAY[(SELECT 1 WHERE FALSE)]; We define a scalar subquery that returns no rows as returning null, so this is equivalent to SELECT ARRAY[NULL]; which *ought* to yield an array containing a single NULL element, but since we can't yet handle arrays containing nulls we punt and return a null array value. That's wrong too ... but it's a different issue. The point Markus is complaining about seems like it should be easily fixable. regards, tom lane
Dnia 24-05-2005, wto o godzinie 00:06 -0400, Tom Lane napisał(a): > Joe Conway <mail@joeconway.com> writes: > > Markus Bertheau wrote: > >> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of > >> ARRAY[] resp. '{}'? > > > Why would you expect an empty array instead of a NULL? > > I think he's got a good point, actually. We document the ARRAY-with- > parens-around-a-SELECT syntax as > > The resulting one-dimensional array will have an element for > each row in the subquery result, with an element type matching > that of the subquery's output column. > > To me, that implies that a subquery result of no rows generates a > one-dimensional array of no elements, not a null array. > > The point Markus is complaining about seems like it should > be easily fixable. Great :) Is this a TODO? -- Markus Bertheau <twanger@bluetwanger.de>
O Joe Conway έγραψε στις May 23, 2005 : > Markus Bertheau wrote: > > why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of > > ARRAY[] resp. '{}'? > > > > Why would you expect an empty array instead of a NULL? NULL is what > you'd get for other data types -- for example: One could ask in the same fashion why someone would want a table if this table contains no rows. A null value may mean "dont know", wheras a '{}' (empty) value may mean "empty set". For instance lets consider the case where an array holds the factors of a polynomial formula. An null value might mean that the person defining the formulas haven't been bothered with this one yet. An empty value might mean that the person indicates that has worked on this particular one, but he/she has no data yet. Ok extreme cases, but to me there is a clean distinction between a null array and an empty array. Also what is definately needed is arrays that may contain null values. > > regression=# SELECT (SELECT 1 WHERE FALSE) IS NULL; > ?column? > ---------- > t > (1 row) > > Joe > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq > -- -Achilleus
Tom Lane wrote: > I think he's got a good point, actually. We document the ARRAY-with- > parens-around-a-SELECT syntax as > > The resulting one-dimensional array will have an element for > each row in the subquery result, with an element type matching > that of the subquery's output column. > > To me, that implies that a subquery result of no rows generates a > one-dimensional array of no elements, not a null array. OK, looks like I'm outnumbered. But as far as I know, we have never had a way to produce a one-dimensional empty array. Empty arrays thus far have been dimensionless. Assuming we really want an empty 1D array, I created the attached patch. This works fine, but now leaves a few oddities to be dealt with, e.g.: regression=# select array_dims(array(select 1 where false)); array_dims ------------ [1:0] (1 row) Any thoughts on how this should be handled for an empty 1D array? > The point Markus is complaining about seems like it should > be easily fixable. Well, "easily" is a relative term. My Postgres hacking neurons have gotten kind of rusty lately -- but then maybe that was your underlying point ;-) Joe Index: src/backend/executor/nodeSubplan.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/executor/nodeSubplan.c,v retrieving revision 1.69 diff -c -r1.69 nodeSubplan.c *** src/backend/executor/nodeSubplan.c 6 May 2005 17:24:54 -0000 1.69 --- src/backend/executor/nodeSubplan.c 26 May 2005 18:52:16 -0000 *************** *** 215,220 **** --- 215,221 ---- ListCell *pvar; ListCell *l; ArrayBuildState *astate = NULL; + Oid element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid; /* * We are probably in a short-lived expression-evaluation context. *************** *** 259,268 **** * * For EXPR_SUBLINK we require the subplan to produce no more than one * tuple, else an error is raised. For ARRAY_SUBLINK we allow the ! * subplan to produce more than one tuple. In either case, if zero ! * tuples are produced, we return NULL. Assuming we get a tuple, we ! * just use its first column (there can be only one non-junk column in ! * this case). */ result = BoolGetDatum(subLinkType == ALL_SUBLINK); *isNull = false; --- 260,269 ---- * * For EXPR_SUBLINK we require the subplan to produce no more than one * tuple, else an error is raised. For ARRAY_SUBLINK we allow the ! * subplan to produce more than one tuple. In the former case, if zero ! * tuples are produced, we return NULL. In the latter, we return an ! * empty array. Assuming we get a tuple, we just use its first column ! * (there can be only one non-junk column in this case). */ result = BoolGetDatum(subLinkType == ALL_SUBLINK); *isNull = false; *************** *** 432,458 **** } } ! if (!found) { /* * deal with empty subplan result. result/isNull were previously ! * initialized correctly for all sublink types except EXPR, ARRAY, * and MULTIEXPR; for those, return NULL. */ if (subLinkType == EXPR_SUBLINK || - subLinkType == ARRAY_SUBLINK || subLinkType == MULTIEXPR_SUBLINK) { result = (Datum) 0; *isNull = true; } } - else if (subLinkType == ARRAY_SUBLINK) - { - Assert(astate != NULL); - /* We return the result in the caller's context */ - result = makeArrayResult(astate, oldcontext); - } MemoryContextSwitchTo(oldcontext); --- 433,459 ---- } } ! if (subLinkType == ARRAY_SUBLINK) ! { ! if (!astate) ! astate = initArrayResult(element_type, oldcontext); ! /* We return the result in the caller's context */ ! result = makeArrayResult(astate, oldcontext); ! } ! else if (!found) { /* * deal with empty subplan result. result/isNull were previously ! * initialized correctly for all sublink types except EXPR * and MULTIEXPR; for those, return NULL. */ if (subLinkType == EXPR_SUBLINK || subLinkType == MULTIEXPR_SUBLINK) { result = (Datum) 0; *isNull = true; } } MemoryContextSwitchTo(oldcontext); *************** *** 925,930 **** --- 926,932 ---- ListCell *l; bool found = false; ArrayBuildState *astate = NULL; + Oid element_type = planstate->ps_ResultTupleSlot->tts_tupleDescriptor->attrs[0]->atttypid; /* * Must switch to child query's per-query memory context. *************** *** 1010,1016 **** } } ! if (!found) { if (subLinkType == EXISTS_SUBLINK) { --- 1012,1033 ---- } } ! if (subLinkType == ARRAY_SUBLINK) ! { ! /* There can be only one param... */ ! int paramid = linitial_int(subplan->setParam); ! ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]); ! ! prm->execPlan = NULL; ! ! if (!astate) ! astate = initArrayResult(element_type, oldcontext); ! ! /* We build the result in query context so it won't disappear */ ! prm->value = makeArrayResult(astate, econtext->ecxt_per_query_memory); ! prm->isnull = false; ! } ! else if (!found) { if (subLinkType == EXISTS_SUBLINK) { *************** *** 1035,1052 **** } } } - else if (subLinkType == ARRAY_SUBLINK) - { - /* There can be only one param... */ - int paramid = linitial_int(subplan->setParam); - ParamExecData *prm = &(econtext->ecxt_param_exec_vals[paramid]); - - Assert(astate != NULL); - prm->execPlan = NULL; - /* We build the result in query context so it won't disappear */ - prm->value = makeArrayResult(astate, econtext->ecxt_per_query_memory); - prm->isnull = false; - } MemoryContextSwitchTo(oldcontext); } --- 1052,1057 ---- Index: src/backend/utils/adt/arrayfuncs.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/arrayfuncs.c,v retrieving revision 1.120 diff -c -r1.120 arrayfuncs.c *** src/backend/utils/adt/arrayfuncs.c 1 May 2005 18:56:18 -0000 1.120 --- src/backend/utils/adt/arrayfuncs.c 26 May 2005 18:52:16 -0000 *************** *** 3252,3257 **** --- 3252,3293 ---- &my_extra->amstate); } + + /* + * initArrayResult - initialize an ArrayBuildState for an array result + * + * rcontext is where to keep working state + */ + ArrayBuildState * + initArrayResult(Oid element_type, MemoryContext rcontext) + { + ArrayBuildState *astate; + MemoryContext arr_context, + oldcontext; + + /* Make a temporary context to hold all the junk */ + arr_context = AllocSetContextCreate(rcontext, + "accumArrayResult", + ALLOCSET_DEFAULT_MINSIZE, + ALLOCSET_DEFAULT_INITSIZE, + ALLOCSET_DEFAULT_MAXSIZE); + oldcontext = MemoryContextSwitchTo(arr_context); + astate = (ArrayBuildState *) palloc(sizeof(ArrayBuildState)); + astate->mcontext = arr_context; + astate->dvalues = (Datum *) + palloc(ARRAY_ELEMS_CHUNKSIZE * sizeof(Datum)); + astate->nelems = 0; + astate->element_type = element_type; + get_typlenbyvalalign(element_type, + &astate->typlen, + &astate->typbyval, + &astate->typalign); + + MemoryContextSwitchTo(oldcontext); + + return astate; + } + /* * accumArrayResult - accumulate one (more) Datum for an array result * *************** *** 3264,3293 **** Oid element_type, MemoryContext rcontext) { ! MemoryContext arr_context, ! oldcontext; if (astate == NULL) { /* First time through --- initialize */ ! ! /* Make a temporary context to hold all the junk */ ! arr_context = AllocSetContextCreate(rcontext, ! "accumArrayResult", ! ALLOCSET_DEFAULT_MINSIZE, ! ALLOCSET_DEFAULT_INITSIZE, ! ALLOCSET_DEFAULT_MAXSIZE); ! oldcontext = MemoryContextSwitchTo(arr_context); ! astate = (ArrayBuildState *) palloc(sizeof(ArrayBuildState)); ! astate->mcontext = arr_context; ! astate->dvalues = (Datum *) ! palloc(ARRAY_ELEMS_CHUNKSIZE * sizeof(Datum)); ! astate->nelems = 0; ! astate->element_type = element_type; ! get_typlenbyvalalign(element_type, ! &astate->typlen, ! &astate->typbyval, ! &astate->typalign); } else { --- 3300,3311 ---- Oid element_type, MemoryContext rcontext) { ! MemoryContext oldcontext; if (astate == NULL) { /* First time through --- initialize */ ! astate = initArrayResult(element_type, rcontext); } else { Index: src/include/utils/array.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/array.h,v retrieving revision 1.54 diff -c -r1.54 array.h *** src/include/utils/array.h 29 Mar 2005 00:17:18 -0000 1.54 --- src/include/utils/array.h 26 May 2005 18:52:16 -0000 *************** *** 176,181 **** --- 176,182 ---- Oid elmtype, int elmlen, bool elmbyval, char elmalign, Datum **elemsp, int *nelemsp); + extern ArrayBuildState *initArrayResult(Oid element_type, MemoryContext rcontext); extern ArrayBuildState *accumArrayResult(ArrayBuildState *astate, Datum dvalue, bool disnull, Oid element_type,
Joe Conway wrote: > Tom Lane wrote: > > I think he's got a good point, actually. We document the ARRAY-with- > > parens-around-a-SELECT syntax as > > > > The resulting one-dimensional array will have an element for > > each row in the subquery result, with an element type matching > > that of the subquery's output column. > > > > To me, that implies that a subquery result of no rows generates a > > one-dimensional array of no elements, not a null array. > > OK, looks like I'm outnumbered. > > But as far as I know, we have never had a way to produce a > one-dimensional empty array. Empty arrays thus far have been dimensionless. > > Assuming we really want an empty 1D array, I created the attached patch. > This works fine, but now leaves a few oddities to be dealt with, e.g.: > > regression=# select array_dims(array(select 1 where false)); > array_dims > ------------ > [1:0] > (1 row) > > Any thoughts on how this should be handled for an empty 1D array? > > > The point Markus is complaining about seems like it should > > be easily fixable. > > Well, "easily" is a relative term. My Postgres hacking neurons have > gotten kind of rusty lately -- but then maybe that was your underlying > point ;-) No one responed to this email, so I will try. Is this the one dimmentional array you were talking about? test=> select array_dims('{}'::integer[]); array_dims------------(1 row) Why is [1:0] wrong to return? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > Joe Conway wrote: >> >>Any thoughts on how this should be handled for an empty 1D array? > > No one responed to this email, so I will try. Is this the one > dimmentional array you were talking about? > > test=> select array_dims('{}'::integer[]); > array_dims > ------------ > > (1 row) In this case, what you get is actually a dimensionless array. Literally, you get this: if (nitems == 0){ /* Return empty array */ retval = (ArrayType *) palloc0(sizeof(ArrayType)); retval->size = sizeof(ArrayType); retval->elemtype = element_type; PG_RETURN_ARRAYTYPE_P(retval);} I.e. the array structure is allocated, the size is set (which is required since arrays are varlena), and the element type is initialized. There is no initialization of ndim, ARR_DIMS(), or ARR_LBOUND(). In this case, since there are no dimensions, array_dims() probably does the right thing by returning NULL. > Why is [1:0] wrong to return? > I'm not sure it is wrong -- it just seems a bit strange. The difference is that in order to return an empty *one-dimensional* array, ndim, ARR_DIMS(), and ARR_LBOUND() are all appropriately set (by the patched code). Basically, ndim == 1, ARR_DIMS() is a single element int array (C array that is) indicating 0 elements for dimension 1, and ARR_LBOUND() is a single element int array indicating a lower bound of 1. This leads to the array_dims() return value of [1:0]. The value 1 is unquestionably correct for the lower bound index, but what should be reported for the upper bound? We can't return [1:1], because that would indicate that we have one element. Joe
Joe Conway wrote: > Bruce Momjian wrote: > > Joe Conway wrote: > >> > >>Any thoughts on how this should be handled for an empty 1D array? > > > > No one responed to this email, so I will try. Is this the one > > dimmentional array you were talking about? > > > > test=> select array_dims('{}'::integer[]); > > array_dims > > ------------ > > > > (1 row) > > In this case, what you get is actually a dimensionless array. Literally, > you get this: > > if (nitems == 0) > { > /* Return empty array */ > retval = (ArrayType *) palloc0(sizeof(ArrayType)); > retval->size = sizeof(ArrayType); > retval->elemtype = element_type; > PG_RETURN_ARRAYTYPE_P(retval); > } > > I.e. the array structure is allocated, the size is set (which is > required since arrays are varlena), and the element type is initialized. > There is no initialization of ndim, ARR_DIMS(), or ARR_LBOUND(). > > In this case, since there are no dimensions, array_dims() probably does > the right thing by returning NULL. > > > Why is [1:0] wrong to return? > > > > I'm not sure it is wrong -- it just seems a bit strange. The difference > is that in order to return an empty *one-dimensional* array, ndim, > ARR_DIMS(), and ARR_LBOUND() are all appropriately set (by the patched > code). Basically, ndim == 1, ARR_DIMS() is a single element int array (C > array that is) indicating 0 elements for dimension 1, and ARR_LBOUND() > is a single element int array indicating a lower bound of 1. This leads > to the array_dims() return value of [1:0]. The value 1 is unquestionably > correct for the lower bound index, but what should be reported for the > upper bound? We can't return [1:1], because that would indicate that we > have one element. OK, so '[1:0]' seems correct. How would to specify such an array manually in a string? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
В Втр, 24/05/2005 в 00:06 -0400, Tom Lane пишет: > Joe Conway <mail@joeconway.com> writes: > > Markus Bertheau wrote: > >> why does SELECT ARRAY(SELECT 1 WHERE FALSE) return NULL instead of > >> ARRAY[] resp. '{}'? > > > Why would you expect an empty array instead of a NULL? > > I think he's got a good point, actually. We document the ARRAY-with- > parens-around-a-SELECT syntax as > > The resulting one-dimensional array will have an element for > each row in the subquery result, with an element type matching > that of the subquery's output column. > > To me, that implies that a subquery result of no rows generates a > one-dimensional array of no elements, not a null array. By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of NULL. Markus -- Markus Bertheau ☭ <twanger@bluetwanger.de>
Markus Bertheau ☭ <twanger@bluetwanger.de> writes: > By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of > NULL. No, that doesn't follow ... we've traditionally considered '{}' to denote a zero-dimensional array. A 1-D array of no elements is '[1:0]={}', just as Joe shows ... or at least it would be except for an overenthusiastic error check: regression=# select '[1:0]={}' :: int[]; ERROR: upper bound cannot be less than lower bound I think this should be a legal boundary case. In general, it should be possible to form zero-size arrays of any number of dimensions. regards, tom lane
Tom Lane wrote: > Markus Bertheau ☭ <twanger@bluetwanger.de> writes: > >>By analogy, array_upper('{}'::TEXT[], 1) should return 0 instead of >>NULL. > > No, that doesn't follow ... we've traditionally considered '{}' to > denote a zero-dimensional array. A 1-D array of no elements is > '[1:0]={}', just as Joe shows ... or at least it would be except > for an overenthusiastic error check: > > regression=# select '[1:0]={}' :: int[]; > ERROR: upper bound cannot be less than lower bound > > I think this should be a legal boundary case. In general, it should be > possible to form zero-size arrays of any number of dimensions. > I've been playing with exactly this over the weekend. Of course, as usual, the devil is in the details. For instance, using the above notation, how would I specify a zero-element 1D array starting at a lower bound index of 0? The answer following the above pattern would be: select '[0:-1]={}'::int[]; You could not use '[0:0]={}'::int[], because that would be a one-element array. I propose the following instead: regression=# select '[1:]={}' :: int[]; int4 ------ {} (1 row) regression=# select array_dims('[1:]={}' :: int[]); array_dims ------------ [1:] (1 row) In other words, an missing upper bound indicates zero elements. Now the next question; what does a 2D zero-element array look like? I think this makes sense: regression=# select '[1:2][1:]={{},{}}'::int[]; int4 ------ {} (1 row) Except (I think) array_out() should probably output something closer to the input literal. Any thoughts on this? Joe