Thread: casting strings to multidimensional arrays yields strange results
Casting strings to multidimensional arrays yields strange results. In one case values are discarded, and in the other a value magically appears. Trying both of these with the array[] constructor syntax yields the expected:

ERROR:  multidimensional arrays must have array expressions with matching dimensions

Tested on both 7.4.3 and 7.5dev.

Kris Jurka

jurka=# SELECT '{{1,2},{2,3},{4}}'::int[][];
     int4
---------------
 {{1},{2},{4}}

jurka=# SELECT '{{1},{2,3},{4,5}}'::int[][];
        int4
---------------------
 {{1,0},{2,3},{4,5}}
On Tue, 27 Jul 2004, Tom Lane wrote:

> Right now I think the sanest behavior would be to throw an error on
> non-rectangular input.  Once we have support for null elements in
> arrays, however, it would arguably be reasonable to pad with NULLs
> where needed,
>

I'm just forwarding a report mentioned on IRC, so I have no real personal interest. The user was really just trying to figure out how it was supposed to work, rather than requesting a particular behavior.

Are you considering NULL padding arrays constructed with the ARRAY[] syntax? If not, then this should definitely throw an error to match that. If we plan on moving to consistent NULL padding, perhaps for now we should consistently pad with 0 instead of sometimes padding and sometimes truncating. This is a change along the direction we're going, even if it is an intermediate behavior.

Doing some testing along these lines for different data types makes me think this might not be the best idea: 0 and '' seem like reasonable defaults for numeric/text data, but for some reason 2000-01-01 is the default for a date, and I'm sure other data types have similar problems.

jurka=# select '{{2001-01-01},{2001-02-02,2003-03-03},{2004-02-02,2004-04-04}}'::date[][];
                                   date
---------------------------------------------------------------------------
 {{2001-01-01,2000-01-01},{2001-02-02,2003-03-03},{2004-02-02,2004-04-04}}

Kris Jurka
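A possible explanation for the 2000-01-01 value, assuming the padded slot is simply a zero-filled datum (nothing in this thread confirms that): PostgreSQL stores a date as a count of days from 2000-01-01, so a stored zero displays as that date. A minimal Python sketch of the arithmetic follows; `date_from_stored_days` is a hypothetical helper for illustration, not a PostgreSQL function.

```python
from datetime import date, timedelta

# PostgreSQL's date type counts days relative to 2000-01-01.
POSTGRES_DATE_EPOCH = date(2000, 1, 1)


def date_from_stored_days(days: int) -> date:
    """Convert a stored day offset (hypothetical raw value) to a calendar date."""
    return POSTGRES_DATE_EPOCH + timedelta(days=days)


# A zero-filled slot would read back as the epoch itself.
print(date_from_stored_days(0))  # 2000-01-01
```

Under the same assumption, a zero-filled int4 reads as 0 and a zero-length text as '', which matches the "reasonable defaults" observed for those types.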
Kris Jurka <books@ejurka.com> writes:
> Are you considering NULL padding arrays constructed with the
> ARRAY[] syntax?

Don't think anyone's really thought about it.

> we should consistently pad with 0 instead of sometimes padding and sometimes
> truncating.

"Pad with 0" is a meaningless concept as soon as you think about nonnumeric data types. I'm not very sure what's even happening inside the code --- it's a bit surprising it doesn't crash outright on pass-by-reference data types ...

I'd agree that the truncation behavior is wrong, but I don't want to get rid of it by causing the padding behavior to happen more often.

			regards, tom lane
Kris Jurka <books@ejurka.com> writes:
> Casting strings to multidimensional arrays yields strange results.

array_in has fairly bizarre behavior when presented with non-rectangular input data, such as your examples:

> jurka=# SELECT '{{1,2},{2,3},{4}}'::int[][];
> jurka=# SELECT '{{1},{2,3},{4,5}}'::int[][];

I don't recall the details right now of how it chooses the actual array dimensions, but it's weird. I've been tempted to rewrite it but have refrained for fear of breaking existing applications. Also, it's not entirely clear what the behavior *should* be.

Right now I think the sanest behavior would be to throw an error on non-rectangular input. Once we have support for null elements in arrays, however, it would arguably be reasonable to pad with NULLs where needed, so that the above would be read as

	{{1,2},{2,3},{4,NULL}}

	{{1,NULL},{2,3},{4,5}}

respectively. If that's the direction we want to head in, it would probably be best to leave array_in alone until we can do that; users tend to get unhappy when we change behavior repeatedly.

What are your thoughts?

			regards, tom lane
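The NULL-padding reading described in this message can be sketched in a few lines of Python. This is only an illustration of the proposed semantics on nested lists, with None standing in for NULL; it is not how array_in works, and `pad_rows` is a hypothetical name.

```python
from typing import List, Optional

Row = List[Optional[int]]


def pad_rows(rows: List[Row]) -> List[Row]:
    """Pad every row of a 2-D list with None up to the width of the widest row."""
    width = max((len(row) for row in rows), default=0)
    return [row + [None] * (width - len(row)) for row in rows]


# The two inputs from the original report, read with padding.
print(pad_rows([[1, 2], [2, 3], [4]]))  # [[1, 2], [2, 3], [4, None]]
print(pad_rows([[1], [2, 3], [4, 5]]))  # [[1, None], [2, 3], [4, 5]]
```

Padding to the widest row never discards input, which is the property the current truncating behavior lacks.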
Tom Lane wrote:
> Right now I think the sanest behavior would be to throw an error on
> non-rectangular input.  Once we have support for null elements in
> arrays, however, it would arguably be reasonable to pad with NULLs
> where needed, so that the above would be read as
>
> 	{{1,2},{2,3},{4,NULL}}
>
> 	{{1,NULL},{2,3},{4,5}}
>
> respectively.  If that's the direction we want to head in, it would
> probably be best to leave array_in alone until we can do that; users
> tend to get unhappy when we change behavior repeatedly.

I think that even once we support NULL array elements, they should be explicitly requested -- i.e. throwing an error on non-rectangular input is still the right thing to do. I haven't suggested that in the past because of the backward-compatibility issue, but maybe now is the time to bite the bullet.

If you think this qualifies as a bug fix for 7.5, I can take a look at it next week.

Joe
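The stricter rule proposed here (explicit NULLs are fine, ragged input is an error) might look like the following Python sketch. It works on nested lists rather than on array literal strings, `array_dims` is a hypothetical name, and the error text is borrowed from the ARRAY[] constructor message quoted at the top of the thread; none of this is the eventual array_in implementation.

```python
from typing import Any, Tuple


def array_dims(value: Any) -> Tuple[int, ...]:
    """Return the dimensions of a nested list, raising ValueError if it is ragged.

    Scalars, including an explicit None, have no dimensions of their own.
    """
    if not isinstance(value, list):
        return ()
    sub_dims = [array_dims(item) for item in value]
    # Every sub-array at the same level must have identical dimensions.
    if any(dims != sub_dims[0] for dims in sub_dims):
        raise ValueError(
            "multidimensional arrays must have array expressions "
            "with matching dimensions"
        )
    return (len(value),) + (sub_dims[0] if sub_dims else ())


print(array_dims([[1, 2], [2, 3], [4, None]]))  # (3, 2)

try:
    array_dims([[1, 2], [2, 3], [4]])
except ValueError as exc:
    print("rejected:", exc)
```

With this rule a user who wants a NULL in the short row has to write it, so nothing is ever padded or truncated silently.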
[ cc'ing pghackers in case anyone wants to object ]

Joe Conway <mail@joeconway.com> writes:
> Tom Lane wrote:
>> Right now I think the sanest behavior would be to throw an error on
>> non-rectangular input.  Once we have support for null elements in
>> arrays, however, it would arguably be reasonable to pad with NULLs
>> where needed, so that the above would be read as
>>
>> 	{{1,2},{2,3},{4,NULL}}
>>
>> 	{{1,NULL},{2,3},{4,5}}
>>
>> respectively.  If that's the direction we want to head in, it would
>> probably be best to leave array_in alone until we can do that; users
>> tend to get unhappy when we change behavior repeatedly.

> I think that even once we support NULL array elements, they should be
> explicitly requested -- i.e. throwing an error on non-rectangular input
> is still the right thing to do. I haven't suggested that in the past
> because of the backward-compatibility issue, but maybe now is the time
> to bite the bullet.

Okay with me. Anyone on pghackers not happy?

> If you think this qualifies as a bug fix for 7.5, I can take a look at
> it next week.

Yeah, we can call it a bug fix.

			regards, tom lane