Thread: casting strings to multidimensional arrays yields strange results

casting strings to multidimensional arrays yields strange results

From
Kris Jurka
Date:
Casting strings to multidimensional arrays yields strange results.  In one
case values are discarded, and in the other a value magically appears.
Trying both of these with the array[] constructor syntax yields the
expected:
ERROR:  multidimensional arrays must have array expressions with matching
dimensions

Tested on both 7.4.3 and 7.5dev.

Kris Jurka

jurka=# SELECT '{{1,2},{2,3},{4}}'::int[][];
     int4
---------------
 {{1},{2},{4}}

jurka=# SELECT '{{1},{2,3},{4,5}}'::int[][];
        int4
---------------------
 {{1,0},{2,3},{4,5}}
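
For comparison, the check that the array[] constructor's error message
implies can be sketched outside the database.  This is only an illustrative
Python model over nested lists, not PostgreSQL's code; the function name
array_dims is hypothetical, and the error text is borrowed from the message
quoted above.

```python
def array_dims(value):
    """Return the dimensions of a nested list, or raise if it is ragged.

    Hypothetical helper: models the "matching dimensions" rule only.
    """
    if not isinstance(value, list):
        return ()  # a scalar element contributes no dimensions
    if not value:
        return (0,)
    sub_dims = [array_dims(item) for item in value]
    # Every sub-array must have exactly the same dimensions as the first.
    if any(dims != sub_dims[0] for dims in sub_dims):
        raise ValueError(
            "multidimensional arrays must have array expressions "
            "with matching dimensions"
        )
    return (len(value),) + sub_dims[0]


print(array_dims([[1, 2], [2, 3], [4, 5]]))  # (3, 2)
```

Under this rule both inputs shown above, [[1,2],[2,3],[4]] and
[[1],[2,3],[4,5]], raise an error instead of being truncated or padded.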

Re: casting strings to multidimensional arrays yields strange

From
Kris Jurka
Date:
On Tue, 27 Jul 2004, Tom Lane wrote:

> Right now I think the sanest behavior would be to throw an error on
> non-rectangular input.  Once we have support for null elements in
> arrays, however, it would arguably be reasonable to pad with NULLs
> where needed,
>

I'm just forwarding a report mentioned on irc so I have no real personal
interest.  The user was really just trying to figure out how it was
supposed to work, rather than requesting a particular behavior.

Are you considering NULL padding arrays constructed with the
ARRAY[] syntax?  If not then this should definitely throw an error to
match that.  If we plan on moving to consistent NULL padding, perhaps now
we should consistently pad with 0 instead of sometimes padding and sometimes
truncating.  This is a change along the direction we're going even if it
is an intermediate behavior.

Doing some testing along these lines for different data types makes me
think this might not be the best idea.  0 and '' seem like reasonable
defaults for numeric/text data, but for some reason 2000-01-01 is the
default for a date, and I'm sure other data types have similar problems.

jurka=# select
'{{2001-01-01},{2001-02-02,2003-03-03},{2004-02-02,2004-04-04}}'::date[][];
                                   date
---------------------------------------------------------------------------
 {{2001-01-01,2000-01-01},{2001-02-02,2003-03-03},{2004-02-02,2004-04-04}}
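
One plausible explanation for the 2000-01-01 value, not confirmed anywhere
in this thread: PostgreSQL stores a date as a count of days relative to
2000-01-01, so a slot that was padded with zero reads back as that day.  The
Python sketch below only illustrates the arithmetic; the names
POSTGRES_EPOCH and date_from_day_offset are made up for the example.

```python
from datetime import date, timedelta

# Assumption for illustration: a date is held as a signed number of days
# counted from 2000-01-01.
POSTGRES_EPOCH = date(2000, 1, 1)


def date_from_day_offset(days):
    """Convert a day offset from the epoch into a calendar date."""
    return POSTGRES_EPOCH + timedelta(days=days)


print(date_from_day_offset(0))    # 2000-01-01, the "padded" element
print(date_from_day_offset(366))  # 2001-01-01, since 2000 was a leap year
```

Seen this way, the date case is the same "pad with 0" behavior as the
integer case, just displayed through a different output function.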


Kris Jurka

Re: casting strings to multidimensional arrays yields strange results

From
Tom Lane
Date:
Kris Jurka <books@ejurka.com> writes:
> Are you considering NULL padding arrays constructed with the
> ARRAY[] syntax?

Don't think anyone's really thought about it.

> we should consistently pad with 0 instead of sometimes padding and sometimes
> truncating.

"Pad with 0" is a meaningless concept as soon as you think about
nonnumeric data types.  I'm not very sure what's even happening
inside the code --- it's a bit surprising it doesn't crash outright
on pass-by-reference data types ...

I'd agree that the truncation behavior is wrong, but I don't want to
get rid of it by causing the padding behavior to happen more often.

            regards, tom lane

Re: casting strings to multidimensional arrays yields strange results

From
Tom Lane
Date:
Kris Jurka <books@ejurka.com> writes:
> Casting strings to multidimensional arrays yields strange results.

array_in has fairly bizarre behavior when presented with non-rectangular
input data, such as your examples:

> jurka=# SELECT '{{1,2},{2,3},{4}}'::int[][];

> jurka=# SELECT '{{1},{2,3},{4,5}}'::int[][];

I don't recall the details right now of how it chooses the actual array
dimensions, but it's weird.  I've been tempted to rewrite it but have
refrained for fear of breaking existing applications.  Also, it's not
entirely clear what the behavior *should* be.

Right now I think the sanest behavior would be to throw an error on
non-rectangular input.  Once we have support for null elements in
arrays, however, it would arguably be reasonable to pad with NULLs
where needed, so that the above would be read as

    {{1,2},{2,3},{4,NULL}}

    {{1,NULL},{2,3},{4,5}}

respectively.  If that's the direction we want to head in, it would
probably be best to leave array_in alone until we can do that; users
tend to get unhappy when we change behavior repeatedly.

What are your thoughts?

            regards, tom lane
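
The two candidate behaviors described in this message can be compared with
a small sketch.  It is a Python model restricted to two-dimensional input
given as a list of rows, with None standing in for a NULL element; the
function names read_strict and read_null_padded are hypothetical and this
is not how array_in is implemented.

```python
def read_strict(rows):
    """Reject non-rectangular input: every row must have the same length."""
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError("non-rectangular array input")
    return [list(row) for row in rows]


def read_null_padded(rows):
    """Pad short rows with None (modeling NULL) up to the longest row."""
    width = max((len(row) for row in rows), default=0)
    return [list(row) + [None] * (width - len(row)) for row in rows]


print(read_null_padded([[1, 2], [2, 3], [4]]))
# [[1, 2], [2, 3], [4, None]]
print(read_null_padded([[1], [2, 3], [4, 5]]))
# [[1, None], [2, 3], [4, 5]]
```

With read_strict, both of those inputs raise an error.  The padded variant
never fails, which is convenient but also hides typos in the input.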

Re: casting strings to multidimensional arrays yields strange

From
Joe Conway
Date:
Tom Lane wrote:
> Right now I think the sanest behavior would be to throw an error on
> non-rectangular input.  Once we have support for null elements in
> arrays, however, it would arguably be reasonable to pad with NULLs
> where needed, so that the above would be read as
>
>     {{1,2},{2,3},{4,NULL}}
>
>     {{1,NULL},{2,3},{4,5}}
>
> respectively.  If that's the direction we want to head in, it would
> probably be best to leave array_in alone until we can do that; users
> tend to get unhappy when we change behavior repeatedly.

I think that even once we support NULL array elements, they should be
explicitly requested -- i.e. throwing an error on non-rectangular input
is still the right thing to do. I haven't suggested that in the past
because of the backward-compatibility issue, but maybe now is the time
to bite the bullet.

If you think this qualifies as a bug fix for 7.5, I can take a look at
it next week.

Joe
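
The position taken here (NULLs only when written explicitly, an error for
anything non-rectangular) can be sketched as a small parser.  This is a
simplified Python model, not array_in: it assumes integer elements, treats
an unquoted NULL token as None, and ignores quoting, escapes and explicit
bounds.  The names parse_int_array and shape are hypothetical.

```python
def shape(value):
    """Return the dimensions of a nested list; raise if rows disagree."""
    if not isinstance(value, list):
        return ()
    shapes = {shape(item) for item in value}
    if len(shapes) > 1:
        raise ValueError("non-rectangular array input")
    return (len(value),) + (shapes.pop() if shapes else ())


def parse_int_array(text):
    """Parse a brace-delimited integer array literal into nested lists."""
    pos = 0

    def parse_value():
        nonlocal pos
        if pos < len(text) and text[pos] == "{":
            pos += 1
            items = []
            if pos < len(text) and text[pos] == "}":
                pos += 1
                return items  # empty array: {}
            while True:
                items.append(parse_value())
                if pos >= len(text):
                    raise ValueError("unexpected end of input")
                if text[pos] == ",":
                    pos += 1
                elif text[pos] == "}":
                    pos += 1
                    return items
                else:
                    raise ValueError(f"unexpected character {text[pos]!r}")
        # Scalar element: read up to the next delimiter.
        start = pos
        while pos < len(text) and text[pos] not in ",{}":
            pos += 1
        token = text[start:pos].strip()
        return None if token.upper() == "NULL" else int(token)

    result = parse_value()
    if pos != len(text):
        raise ValueError("trailing characters after array literal")
    shape(result)  # raises ValueError on non-rectangular input
    return result


print(parse_int_array("{{1,2},{2,3},{4,NULL}}"))
# [[1, 2], [2, 3], [4, None]]
```

In this model '{{1,2},{2,3},{4}}' is rejected, and the only way to get a
missing element is to spell it out as NULL.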

Re: casting strings to multidimensional arrays yields strange results

From
Tom Lane
Date:
[ cc'ing pghackers in case anyone wants to object ]

Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> Right now I think the sanest behavior would be to throw an error on
>> non-rectangular input.  Once we have support for null elements in
>> arrays, however, it would arguably be reasonable to pad with NULLs
>> where needed, so that the above would be read as
>> 
>>     {{1,2},{2,3},{4,NULL}}
>> 
>>     {{1,NULL},{2,3},{4,5}}
>> 
>> respectively.  If that's the direction we want to head in, it would
>> probably be best to leave array_in alone until we can do that; users
>> tend to get unhappy when we change behavior repeatedly.

> I think that even once we support NULL array elements, they should be 
> explicitly requested -- i.e. throwing an error on non-rectangular input 
> is still the right thing to do. I haven't suggested that in the past 
> because of the backward-compatibility issue, but maybe now is the time 
> to bite the bullet.

Okay with me.  Anyone on pghackers not happy?

> If you think this qualifies as a bug fix for 7.5, I can take a look at 
> it next week.

Yeah, we can call it a bug fix.
        regards, tom lane