Thread: Assignment to array elements
I'm nearly ready to commit a patch that adds support for INSERT and UPDATE assignments to individual fields of composite columns, along the lines of UPDATE mytab SET complex_col.r = (complex_col).r + 1 WHERE ...; INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2); This turned out to be a rather more interesting project than I'd expected when I started, because it had a lot of interactions with the existing support for updating elements of arrays. I believe I've got things all worked out so that the two things play together; you can do weird & wonderful stuff like UPDATE foo SET col.subfld1.subfld2[2] = x, col.subfld2.y = z; and it all works. However, I was forced to confront the fact that our existing semantics for INSERT to an array member are just plain bizarre. If you look at the existing "arrays" regression test you will discover that you can do INSERT INTO tab (arraycol[subscripts]) VALUES (something); but *the subscripts make absolutely no difference in the result*. In fact, all three of the INSERTs in that test that do this are wrong, because the provided subscripts disagree with the dimensionality of the supplied data. The only reason this regression test ever passed is that the subscript values given in the INSERT target list are ignored. What I would like to do about this is define INSERT to a subscripted column name as working the same way that an assignment to a element or slice of a zero-dimension array presently does --- that is, you get an actual array back and not a NULL. It would also fall out that UPDATE of a null array value would behave that way. This is an area that no one's been very happy about in the past, so I'm not expecting a lot of push-back on this proposal, but I thought I'd better toss it out and see if anyone complains... regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > What I would like to do about this is define INSERT to a subscripted > column name as working the same way that an assignment to a element or > slice of a zero-dimension array presently does --- that is, you get an > actual array back and not a NULL. It would also fall out that UPDATE > of a null array value would behave that way. That certainly seems nice. Though update of a part of a null array value seems a little sketchy in theory it would sure be convenient for me. What I'm curious about is where the original behaviour came from. Is it just because insert with subscripts was never implemented? Or was there a rationale for ignoring the subscripts? -- greg
Greg Stark <gsstark@mit.edu> writes: > What I'm curious about is where the original behaviour came from. Is > it just because insert with subscripts was never implemented? Or was > there a rationale for ignoring the subscripts? It's been awhile, but I think that "ignore the subscripts" may have been something I put in, because the original behavior was even more broken. Old-timers will recall that the array behavior we got from Berkeley was bizarrely broken in a whole lot of ways :-( The regression tests expect that this will work: INSERT INTO arrtest (a, b[1][2][2], c, d[2][1]) VALUES ('{}', '{3,4}', '{foo,bar}', '{bar,foo}'); (AFAICT this has been in there since Berkeley) and you'll notice that the values supplied for b and d do not agree at all with the dimensionality specified by the insert targetlist. There are several other errors of the same kind. So the net effect has certainly always been that subscripts supplied here were ignored. Given that we never documented that you could write a subscript in INSERT, I doubt anybody ever tried, so the lack of functionality didn't get noticed. With the patch I'm about to commit, the subscripts *are* functional and so the above command draws an error. I've updated the regression tests to do INSERT INTO arrtest (a, b[1:2], c, d[1:2]) VALUES ('{}', '{3,4}', '{foo,bar}', '{bar,foo}'); which inserts the same data that this test has always inserted. Note that you have to write array-slice subscripts if your intention is to insert more than one element this way. It appears that the Berkeley guys may have intended to allow this shorthand as an equivalent to the above: INSERT INTO arrtest (a, b[2], c, d[2]) VALUES ('{}', '{3,4}', '{foo,bar}', '{bar,foo}'); but I think that's a lousy idea since it forecloses assigning just a single array element, which seems like behavior at least as useful as the slice case. With the patch, you can do this: INSERT INTO arrtest (b[1], b[2]) VALUES (3, 4); to insert the same b array value as the above. regards, tom lane